Abstract


This thesis introduces a new calculus for manipulating linear-program decomposition
schemes. A linear program is represented by a communication network, which is
decomposed by splitting nodes in two, and a transformation is defined to recover
subproblems from the network. We also define a dual-symmetric oracle that provides
solutions to linear programs, and can be performed by the simplex method, nested
decomposition, and finally, parallel decomposition.


Two important classes of linear program serve as examples for the above calculus:
staircase linear programs and stochastic linear programs. For the former case, a
sophisticated yet experimental computer code has been written for an IBM 3090/600E
with six processors. The code performs the parallel decomposition algorithm and is
tested on twenty-two small to medium sized "real-world" problems. Experiments
show that in addition to speedups provided by decomposition alone, performance is
improved by using parallel processors.
Notation


Math  Symbols 

$\mathbb{R}$  The ordered set of real numbers.

$\emptyset$  The empty set.

$\forall$  Means "for all".

$\in$  Means "is an element of".

$\supseteq$  Means "contains the set".

$\cap$  Intersection operation on sets.

$\cup$  Union operation on sets.

$e$  A column vector of ones.

(•)D  The dual form of the linear program in the equation (•).

⊟  A binary operator that partitions the rows of a linear program.

◫  A binary operator that partitions the columns of a linear program.

□  The inverse of the row and column partition operators.

∎  End of proof.

D  End of example.
Variables and Index Sets

All variables serve both as vectors in multidimensional real space and as sets of indices.
The primal variables index the columns of the matrix $A$ and row vector $c^T$, and the
dual variables index the rows of $A$ and the column vector $b$. The context will make
clear whether a character such as $x$ represents a real value or an index to a column.

Two types of variable are present in the formulation of a decomposition subproblem:
original variables from the original problem and added variables for the purpose
of appending and modifying the original ones.

The  primal  and  dual  variables  are  named  with  corresponding  Roman  and  Greek 
characters.  Even  the  functions  of  the  characters  as  index  sets  and  variables  bear 
symmetric  interpretations. 

Index  only 
i, k  Row indices.
j, l  Column indices.
a  An  index  for  the  objective  row. 

3  An  index  for  the  right-hand  side. 

Dual  Variable  or  Row  Index 

$\lambda$  An added dual variable on new constraints.
$\nu$  An added dual variable on the objective modification constraint.
$\pi$  Original dual variables.

$\psi$  An added dual variable on column accounting constraints.

$\omega$  An added dual variable on the right-hand side modification constraint.

$\theta$  An added dual variable on primal convexity constraints.


Primal Variable or Column Index

l  An added primal variable to combine new columns.
u  An added primal variable to implement a right-hand side modification.
x  The original primal variables.

y  An added primal variable to account for passed primal solutions.
w  An added primal variable to account for the objective modification.
t  An added primal variable on dual convexity constraints.

Variable  only 

a  A  non-negative  scalar, 
z  An  objective  value. 

Sets 

$\mathcal{A}$  A closed polyhedral set representing a primal feasible region (in context).
$\mathcal{B}$  A closed polyhedral set representing a dual feasible region.

$\mathcal{C}$  A set containing column indices.

$\mathcal{G}_N$  The set containing all communication networks with N nodes.

$\dot{\mathcal{D}}$  A set containing dual extreme points.

$\vec{\mathcal{D}}$  A set containing dual extreme rays.

$\mathcal{R}$  A set containing row indices.

$\dot{\mathcal{P}}$  A set containing primal extreme points.

$\vec{\mathcal{P}}$  A set containing primal extreme rays.
Data Structures


Original  Data 

A  Constraint  coefficient  matrix. 

b  Right-hand side vector.
c  Vector of costs.

Added  Data  in  Real  Space 

$\Pi$  An added data structure to contain extra constraints.

X  An  added  data  structure  to  hold  extra  columns. 

0  An  added  data  structure  for  modifying  an  objective  function. 

y  An  added  data  structure  for  modifying  the  right-hand  side. 

0  An  added  data  structure  containing  the  slope  of  the  dual  objective  function  in 
a  dual  extreme  ray  direction. 

i  An  added  data  structure  containing  the  slope  of  the  primal  objective  function 
in  an  extreme  ray  direction. 

Added  Data  in  Binary  Space 

$\delta$  A binary scalar indicating a dual extreme point in
d  A binary scalar indicating a primal extreme point in y.

$\gamma$  A binary vector indicating the corresponding dual extreme points in $\Pi$.

g  A binary vector indicating the corresponding primal extreme points in X.

$I_a$  Subproblem interface matrix for arc a containing at most one unit entry per
row and column.



Subscripts,  Superscripts  and  Accents 

The subscript n denotes information for node n, while the subscript a denotes
information for arc a. Various math accents are used on both primal and dual variables
throughout. The tilde accent, as in $\tilde{x}$, indicates a general solution value. The arrow
accent, as in $\vec{x}$, indicates an extreme ray solution value. The dot accent, as in $\dot{x}$,
indicates an extreme point solution value.

$A_{ij}$  The ijth element of the matrix A.

$b_i$  The ith element of column vector b.

$c_j$  The jth element of row vector c.

$x_{ni}^k$  The ith element of the kth primal solution $x_n$ for node n.

$\pi_{nj}^k$  The jth element of the kth dual solution for node n.

$I_a^{ij}$  The ijth element of the matrix $I_a$.

Dimensions 

p  The number of processors (in context).

N  The number of subproblems.

K  The number of times a given subproblem has been solved.

$r_a$  The number of coupling rows between the subproblems connected by arc a.

$r_n$  The number of non-zero rows in the column partition of subproblem n.

$c_n$  The number of columns in the partition for subproblem n.

$e_n$  The number of non-zero elements in the column partition of subproblem n.

$r_n$  The number of rows in the formulation of subproblem n.

$c_n$  The number of columns in the formulation of subproblem n.

$e_n$  The number of non-zeros in the formulation of subproblem n.

N  The maximum number of subproblems handled by the code.

p  The maximum number of processors handled by the code.

$r_a$  The maximum number of coupling constraints between adjacent subproblems
that can be handled by the code.

Graph  Theory 

$\mathcal{N}$  The set of nodes in a graph.
n  A node in $\mathcal{N}$.

$\mathcal{A}$  The set of arcs in a graph (in context).
a  An arc in $\mathcal{A}$.

$T_a$  The type for arc a (up, down, left, or right).
g  A communication graph (when not subscripted).
h  An incidence graph.
p  A partition graph (in context).

$\mathcal{P}$  The set of all partition graphs (in context).

Problems  and  Solution  Methods 

NAME/n/p  Problem NAME divided into n subproblems and solved using p processors.

ALG/p  Algorithm ALG is run using p processors.


Multiple  Meanings 


The characters r, c, e, p, $\mathcal{A}$, and $\mathcal{P}$ can have multiple meanings. The first three are
redefined in Chapter Four to refer to row, column and element dimensions of LPs. In
Chapter Three, p is introduced as a partition graph, while in Chapter Four it refers
to the number of computer processors applied to solving a test problem. Early in
Chapter One, $\mathcal{A}$ refers to a primal feasible region, while later it is used as the set of
arcs in a communication network. Finally, in Chapter One, $\mathcal{P}$, when accented with
an arrow or dot, is a set of primal extreme rays or extreme points, but in Chapter
Three, $\mathcal{P}$ is used exclusively as the set of all partition graphs in a communication
network.


Chapter  1 

Symbolic  Decomposition 


DESCRIBED  herein  is  a  methodology  by  which  Linear  Programs  (LPs)  can 
be  decomposed  into  a  collection  of  interdependent  LPs  and  solved  with  a 
decomposition  algorithm  on  a  parallel  computer.  Equation  (1.1)  introduces 
the  notation  used  throughout  for  linear  program  formulations: 


$$
\begin{array}{ll}
\min\limits_{x \ge 0} & c^T x = z \\
\text{s.t.} & \pi : \; Ax \ge b.
\end{array}
\qquad (1.1)
$$


Corresponding to the constraints $Ax \ge b$ are dual variables $\pi$. The positioning of
the  dual  variables  to  the  left  in  (1.1)  defines  the  correspondence  between  the  slacks 
of  the  primal  constraints  and  the  dual  variables.  An  analogous  correspondence  exists 
between  the  slacks  of  the  dual  constraints  (reduced  costs)  and  the  primal  variables. 

Two important classes of problem will serve as guinea pigs to be dissected. Their
anatomies are displayed in Figure 1.1. The dissection proceeds as a series of bisections
or slices through the rows and columns of $A$, corresponding to a series of partitions
of its row and column index sets. In the figure, the gray submatrices are where the
nonzero coefficients are located, and the heavy lines with numerals 1, 2, 3, are the
slices and the order in which they are made. Appendix B contains a collection of such
nonzero coefficient patterns for a number of real-world staircase problems.
[Figure: nonzero patterns with numbered slices. Left panel: Staircase Linear Program. Right panel: Two-Stage Stochastic Linear Program.]

Figure 1.1: Partitioning variables and constraints.

The Staircase LP in Figure 1.1 has its column index set $\mathcal{C}$ partitioned into four
sets by three slices. The first one slices off the leftmost block of nonzeros and the
next two in turn slice off the remaining blocks in the same way. In a different manner,
the row index set of the Two-Stage Stochastic LP is first partitioned into two sets,
a top and a bottom; then the set associated with the bottom blocks of nonzeros
is partitioned in a fashion similar to that of the Staircase LP. These two classes of
linear program have many practical applications. There is an extensive literature on
exploiting their special structure in order to develop an efficient solution algorithm;
for example: [Dan59], [Zad62], [DGDC l], [VSlM], [Gla71], [Ho74], [DGSS].
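The successive slicing of the staircase column index set can be illustrated with a minimal sketch. The function names and the block widths used below are hypothetical and chosen only for illustration; indices are 0-based, and each slice cuts off the leftmost remaining block, as described for the Staircase LP above.

```python
def bisect(index_set, cut):
    """Split an ordered index list into (left, right) at position `cut`."""
    return index_set[:cut], index_set[cut:]


def staircase_column_partition(n_cols, block_widths):
    """Partition column indices 0..n_cols-1 by successive slices.

    Each slice removes the leftmost block from what remains, so
    len(block_widths) blocks are produced by len(block_widths) - 1 slices.
    """
    if sum(block_widths) != n_cols:
        raise ValueError("block widths must add up to the number of columns")
    remaining = list(range(n_cols))
    parts = []
    for width in block_widths[:-1]:
        left, remaining = bisect(remaining, width)
        parts.append(left)
    parts.append(remaining)  # the last block is what the final slice leaves
    return parts


# Hypothetical staircase LP with 10 columns in four blocks (three slices).
parts = staircase_column_partition(10, [3, 2, 2, 3])
```

Here `parts` holds four disjoint index sets whose union is the full column index set, which is all that the symbolic treatment requires of a partition.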

The title of this thesis, The Parallel Decomposition of Linear Programs, means
that these structures and others can be further exploited if the LPs are solved using
parallel computers. The collection of interdependent LPs (subproblems) resulting
from the decomposition prescribed above can be solved asynchronously on a parallel
computer. Recalling that decomposition algorithms are iterative, we will show
that the corresponding subproblems can be solved repeatedly, with information being
passed from one to another until convergence is reached. Moreover, with a parallel
computer we can solve these subproblems simultaneously with all processors efficiently
employed, thus obtaining the overall solution more quickly.
The main contribution of this thesis involves developing a "symbolic calculus" for
partitioning linear programs and demonstrating its usefulness on practical LP examples.
Finally, we show that a parallel decomposition algorithm can indeed outperform
serial algorithms, by experimenting with a computer code designed to solve staircase
LPs.


Figure  1.2:  Symbolic  Decomposition  covered  by  Chapter  One. 


Figure 1.2 outlines the derivation of our symbolic decomposition calculus and its
role in producing a system of subproblems. The left side of the diagram represents the
traditional algebraic derivations of subproblem formulations. We propose a transform
to a symbolic space that is based on network theory, whose elements we call communication
networks. The symbol $\mathcal{G}_N$ represents the collection of all such networks on N nodes.
In place of algebra, we define simple operators on the network that effect horizontal
and vertical slices, producing decompositions in ever more complex schemes. Finally, in Chapter 3
we provide a generalized parallel algorithm based on some given network in $\mathcal{G}_N$. This
algorithm is a generalization of nested decomposition [Ho74, Abr83] and through
experiments on twenty-two staircase-LP test problems we show that decomposition
algorithms can be sped up by parallel computers.

Given  a  dissection  of  the  anatomy  of  a  particular  LP,  like  those  in  Figure  1.1, 
we can formulate certain well defined subproblems and a well defined algorithm to
modify and solve the subproblems, thereby arriving at a solution to the original LP.
We  call  the  entire  process  symbolic  decomposition  because  it  is  concerned  not  with 
actual  data  values  but  with  the  relationships  between  them  (as  necessary  to  solve 
the  problem).  The  symbolic  calculus  we  will  describe  allows  for  the  possibility,  if 
desired,  of  refining  a  dissection  to  the  point  where  the  individual  blocks  consist  of 
single  coefficients  of  the  matrix  A.  Using  this  calculus,  we  can  conveniently  partition 
the  blocks  of  a  large-scale  LP  to  exploit  many  different  underlying  patterns  found  in 
real-world  problems. 

Chapter  One  reviews  the  theory  of  decomposition  by  Goldman,  Dantzig  and  Wolfe, 
and  Benders,  and  introduces  symbolic  decomposition.  It  concludes  with  a  theorem 
on subproblem interactions.

Decomposition, as described by Geoffrion, involves either some kind of restriction
or some kind of relaxation of the original problem [Geo70]. Considerable advantage
can  be  gained  when  the  restriction  or  relaxation  results  in  a  much  simpler  problem. 
This  is  especially  true  when  the  original  problem  size  is  so  large  it  would  overwhelm 
the  computer.  The  full  problem  can  be  broken  into  many  smaller  ones  that  can  be 
solved  to  obtain  an  overall  solution.  This  is  decomposition. 

All LP decomposition algorithms are based on two well known theorems. The first
is the Goldman Resolution Theorem [Gol56], which states that a convex polyhedron
can be described as a convex combination of its extreme points (provided such exist)
plus a non-negative combination of its extreme rays (when not bounded). The second
is that the solution of a linear program solved by the simplex method [Dan63] (whether
primal or dual) is always at an extreme point (and/or an extreme ray).

There are two fundamental methods of decomposing a linear program into a collection
of LP subproblems: the Dantzig-Wolfe method [DW61] and the Benders method
[Ben62]. They are the duals of each other. Using the former you slice horizontally,
while with the latter you slice vertically.

The horizontal slice of the D-W method partitions the row indices $\mathcal{R}$ into two sets.
We name them top and bottom. From them we generate two D-W subproblems. The
top set corresponds to the traditional D-W Master problem, a relaxed version of
(1.1) defined on only the constraints so indexed, while the bottom set corresponds
to the D-W Slave problem. Information is passed up and down between them in the
decomposition algorithm.

The dual method, that of Benders, operates via a partition of the column index
set $\mathcal{C}$ into two sets: left and right. The left is used to generate the Master and the
right the Slave.

We offer a caution on notation. The symbols for variables, i.e., $x$ and $\pi$, are used
in two ways that are context sensitive. In some places these symbols denote the values
of primal and dual variables, but in other places they denote index sets for columns
and/or rows of $A$, $b$, and $c$. Their proper interpretation should always be clear from
their use.


1.1  Goldman’s  Resolution  Theorem 

Goldman’s  Resolution  Theorem  [Gol56]  forms  the  basis  for  partially  representing 
feasible  regions  of  subproblems  and  generating  sets  of  necessary  conditions  to  describe 
them.  The  conditions  are  generated  from  successive  solutions  of  the  appropriate 
subproblems.

Let the closed polyhedral set $\mathcal{A} = \{x : Ax \ge b,\ x \ge 0\}$, where $A$ is a matrix of
finite dimensions, and let the sets $\dot{\mathcal{P}}$ and $\vec{\mathcal{P}}$ consist of all the extreme points and rays,
respectively, of $\mathcal{A}$, the primal feasible region of (1.1).
Theorem 1.1 (Goldman) The set $\mathcal{A}$ can be expressed as a convex combination of
its extreme points $\dot{\mathcal{P}}$ plus a non-negative combination of its extreme rays $\vec{\mathcal{P}}$:

$$\mathcal{A} = \{x = \dot{x} + \alpha\vec{x},\ \forall \dot{x} \in \mathrm{conv}(\dot{\mathcal{P}}),\ \vec{x} \in \vec{\mathcal{P}},\ \alpha \ge 0\}.$$

In addition, the number of extreme points and rays will be finite.

The finiteness of a decomposition algorithm stems from the fact that the number
of extreme points and extreme rays of the polyhedral set $\{x \mid Ax \ge b,\ x \ge 0\}$ is finite.
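As a small numerical illustration of the resolution idea, the following sketch builds a point of a polyhedron from a convex combination of extreme points plus a non-negative combination of extreme rays. The region $\{x \ge 0 : x_1 + x_2 \ge 1\}$, the function names, and the chosen weights are assumptions made for this example only; they do not come from the thesis.

```python
def resolve(points, weights, rays, multipliers):
    """Return sum_i weights[i]*points[i] + sum_j multipliers[j]*rays[j].

    `weights` must be a convex combination (non-negative, summing to one);
    `multipliers` must be non-negative.
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must form a convex combination")
    if any(m < 0 for m in multipliers):
        raise ValueError("ray multipliers must be non-negative")
    x = [0.0] * len(points[0])
    terms = list(zip(points, weights)) + list(zip(rays, multipliers))
    for vec, coef in terms:
        for i, v in enumerate(vec):
            x[i] += coef * v
    return x


def in_region(x):
    """Membership test for the example region {x >= 0 : x1 + x2 >= 1}."""
    return x[0] + x[1] >= 1 and min(x) >= 0


# Example region: extreme points (1,0) and (0,1); extreme rays (1,0) and (0,1).
extreme_points = [(1.0, 0.0), (0.0, 1.0)]
extreme_rays = [(1.0, 0.0), (0.0, 1.0)]
x = resolve(extreme_points, [0.5, 0.5], extreme_rays, [1.5, 2.5])
```

With these weights the combination gives the point (2, 3), which lies in the example region. The representation is generally not unique: other weights and multipliers can produce the same point.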


1.2  Solution  Properties  of  Linear  Programs 

In order to enhance our geometric intuition of decomposition algorithms, we now
describe the forms of information being passed between subproblems. First, let us
define the sets $\dot{\mathcal{D}}$ and $\vec{\mathcal{D}}$ as the set of all extreme points and rays, respectively, of the
set $\mathcal{B} = \{\pi : A^T\pi \le c,\ \pi \ge 0\}$, the dual feasible region of (1.1). The simplex method
and the following three theorems are due to Dantzig [Dan63].

Theorem 1.2 (Optimal Solution) If an optimal solution to (1.1) exists, the simplex
method will generate an optimal primal solution, $\dot{x} \in \dot{\mathcal{P}}$, and a vector of optimal
dual multipliers, $\dot{\pi} \in \dot{\mathcal{D}}$. In addition, $c^T x \ge b^T\dot{\pi},\ \forall x \in \mathcal{A}$, with equality at $\dot{x}$.

Corollary 1.3 (Separating Hyperplane (1)) The hyperplane $\{x : c^T x = b^T\dot{\pi}\}$
separates the set $\mathcal{A}$ from all points $x$ that could give a lower value to $c^T x$, where
$\dot{\pi} \in \dot{\mathcal{D}}$ is a vector of optimal dual multipliers.

Theorem 1.4 (Unbounded Solution) If the solution to (1.1) is unbounded, the
simplex method will give an extreme point, $\dot{x} \in \dot{\mathcal{P}}$, and an extreme ray, $\vec{x} \in \vec{\mathcal{P}}$, of
$\mathcal{A}$ such that $c^T(\dot{x} + \alpha\vec{x}) \to -\infty$ as $\alpha \to \infty$. In addition, no feasible vector of dual
multipliers $\pi$ exists, so $\mathcal{B}$ is empty.
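Let $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$ and $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$ be corresponding row partitions. The problem
(1.1) becomes

$$
\begin{array}{ll}
\min\limits_{x \ge 0} & c^T x = z \\
\text{s.t.} & \pi_1 : \; A_1 x \ge b_1 \\
 & \pi_2 : \; A_2 x \ge b_2.
\end{array}
\qquad (1.2)
$$

Define the top set $\mathcal{A}_1 = \{x : A_1 x \ge b_1\}$ and the bottom set $\mathcal{A}_2 = \{x : A_2 x \ge b_2,\ x \ge 0\}$,
where the latter includes the non-negativity constraint. Note that their
intersection is the original feasible region: $\mathcal{A} = \mathcal{A}_1 \cap \mathcal{A}_2$.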


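Theorem 1.5 (Infeasible Solution) If there is no feasible solution for (1.2), the
simplex method will find a vector of dual multipliers $(\vec{\pi}_1, \vec{\pi}_2) \in \vec{\mathcal{D}}$ that form an extreme
ray of the polyhedron $\mathcal{B} = \{(\pi_1, \pi_2) : A_1^T\pi_1 + A_2^T\pi_2 \le c,\ (\pi_1, \pi_2) \ge 0\}$. The ray satisfies
$A_1^T\vec{\pi}_1 + A_2^T\vec{\pi}_2 \le 0$, $(\vec{\pi}_1, \vec{\pi}_2) \ge 0$, and $b_1^T\vec{\pi}_1 + b_2^T\vec{\pi}_2 > 0$. If we assume that $\mathcal{A}_2 \ne \emptyset$ then

$$\vec{\pi}_1^T A_1 x < \vec{\pi}_1^T b_1 \quad \forall x \in \mathcal{A}_2.$$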


In the following corollary to Theorem 1.5, the dual ray $(\vec{\pi}_1, \vec{\pi}_2)$ is identical to that
in the theorem.

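Corollary 1.6 (Separating Hyperplane (2)) If both $\mathcal{A}_1$ and $\mathcal{A}_2$ are non-empty,
the hyperplane $\{x : \vec{\pi}_1^T A_1 x = \vec{\pi}_1^T b_1\}$ strictly separates the sets $\mathcal{A}_1$ and $\mathcal{A}_2$, as does the
hyperplane $\{x : \vec{\pi}_2^T A_2 x = \vec{\pi}_2^T b_2\}$.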

Table  1.1  summarizes  the  four  combinations  of  primal  and  dual  feasibility,  and  the 
results  of  the  previous  three  theorems  from  the  classical  theory.  For  each  combination, 
it  lists  the  forms  of  the  primal  and  dual  solutions,  with  the  primal  forms  handled 
by the simplex method in bold face: feasible optimal, feasible unbounded, and
infeasible.  In  the  primal  infeasible  cases,  the  dual  ray  is  obtained  at  the  end  of  Phase 
1.  Most  algorithms  terminate  at  this  point  without  determining  a  dual  extreme  point 
when  one  exists. 

Status                                 | Primal solution form | Dual solution form
---------------------------------------+----------------------+--------------------
Primal and Dual Optimal                | Extreme Point        | Extreme Point
Primal Unbounded and Dual Infeasible   | Extreme Point & Ray  | None
Primal Infeasible and Dual Unbounded   | None                 | Extreme Point & Ray
Primal and Dual Infeasible             | Extreme Ray          | Extreme Ray

Table 1.1: Solutions of a primal formulation.

Although the simplex method typically stops with only a dual ray when primal
infeasible, it can yet obtain the non-bold face information. When a problem is known
to be primal infeasible, it is easy to replace its right-hand side by one that makes
it feasible. It will then finish either optimal or unbounded. If optimal, we have
the "Primal Infeasible, Dual Feasible" case, and the optimal dual solution is the
needed dual extreme point. If unbounded, we have the "Primal Infeasible, Dual
Infeasible" case, and the ray associated with the unbounded solution is the missing
primal extreme ray.

We  now  introduce  the  concept  of  an  oracle.  The  word  oracle  usually  refers  to  a 
magical  source  of  truth.  There  is  not  much  magic  in  our  case,  merely  convenience. 
For  the  purpose  of  argument,  the  manner  in  which  the  oracle  obtains  information  is 
not  as  important  as  the  fact  that  it  does  provide  it,  and  in  a  specific  form.  Our  oracle 
will  provide  solutions  to  linear  programs. 

Definition 1.7 (An Oracle) When consulted, an oracle $\mathcal{O}(\cdot)$ offers a "solution" for
linear programs. In the case of (1.1), the oracle will generate either:


1. primal and dual optimal extreme points $\dot{x}$ and $\dot{\pi}$ satisfying $c^T\dot{x} = b^T\dot{\pi}$, or

2. a primal extreme ray $\vec{x}$ satisfying $c^T\vec{x} \le 0$, or

3. a dual extreme ray $\vec{\pi}$ satisfying $b^T\vec{\pi} \ge 0$.

In cases 2 and 3, we use a weak inequality to cover cases for which there is a ray of
optimal solutions.

Lemma 1.8 The Phase 1 / Phase 2 Simplex Method can perform $\mathcal{O}(1.1)$.

Proof: Compare the three oracle cases with Table 1.1. ∎

This  oracle  forms  the  basis  for  all  of  the  following  algorithmic  results. 
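The three forms of oracle answer in Definition 1.7 can be captured in a small data structure. The sketch below is only an illustration under stated assumptions: the class `OracleAnswer`, the function `is_valid_answer`, and the tolerance are hypothetical names and choices, and the check covers only the (in)equalities named in the definition, not feasibility or extremality of the returned vectors.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


@dataclass
class OracleAnswer:
    kind: str                              # "optimal", "primal_ray" or "dual_ray"
    x: Optional[Sequence[float]] = None    # primal extreme point or ray
    pi: Optional[Sequence[float]] = None   # dual extreme point or ray


def is_valid_answer(c, b, ans, tol=1e-9):
    """Check the condition that Definition 1.7 attaches to each case."""
    if ans.kind == "optimal":       # case 1: equal primal and dual objectives
        return abs(dot(c, ans.x) - dot(b, ans.pi)) <= tol
    if ans.kind == "primal_ray":    # case 2: c^T x <= 0 along the ray
        return dot(c, ans.x) <= tol
    if ans.kind == "dual_ray":      # case 3: b^T pi >= 0 along the ray
        return dot(b, ans.pi) >= -tol
    return False


# min x1 + x2  s.t.  x1 + x2 >= 1, x >= 0: optimal at x = (1, 0), pi = 1.
c, b = [1.0, 1.0], [1.0]
optimal = OracleAnswer("optimal", x=[1.0, 0.0], pi=[1.0])

# Same constraint with cost (-1, 0): unbounded along the ray (1, 0).
unbounded = OracleAnswer("primal_ray", x=[1.0, 0.0])

# x1 >= 1 and -x1 >= 0 are inconsistent; pi = (1, 1) is a certifying dual ray.
infeasible = OracleAnswer("dual_ray", pi=[1.0, 1.0])
```

For the argument in the text, only the form of the answer matters; how the oracle computes it is deliberately left open.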


1.3  Dantzig-Wolfe Decomposition

We will now review Dantzig-Wolfe (D-W) decomposition [DW61] by partitioning the
row index set of (1.2). We present decomposition algorithms as a combination of two
parts: (a) the subproblem formulations, and (b) the protocol for passing information.


1.3.1  The  D-W  Subproblems 


In a D-W decomposition scheme, let $X_{\cdot j} \in \dot{\mathcal{P}}_2 \cup \vec{\mathcal{P}}_2$ be an extreme point or extreme
ray of $\mathcal{A}_2 = \{x : A_2 x \ge b_2,\ x \ge 0\}$. Let $g_j = 1$ in the case of the former, and let
$g_j = 0$ in the latter. Let all such vectors $X_{\cdot j}$ form the columns of a matrix $X$. Then
by Goldman's Theorem, any point $x \in \mathcal{A}_2$ can be represented by

$$x = Xl, \quad l \ge 0, \quad 1 = g^T l, \qquad (1.3)$$


for some choice of variables $l$. The choice of $l$ is not necessarily unique. Substituting
the constraints on $x$ from (1.3) for those corresponding to the region $\mathcal{A}_2$ in (1.2), we
obtain the "Master" problem of the D-W decomposition scheme:

$$
\begin{array}{ll}
\min\limits_{l \ge 0,\ x} & c^T x = z_1 \\
\text{s.t.} & \theta : \; g^T l = 1 \\
 & \psi : \; Xl - Ix = 0 \\
 & \pi_1 : \; A_1 x \ge b_1.
\end{array}
\qquad (1.4)
$$
This system is of course equivalent to solving (1.2). On the surface it would appear
that this transformation was made at the expense of greatly increasing the number
of variables by including $l$. Suppose, however, that the oracle is consulted on a
formulation of (1.4) where the columns of $X$ contain only a subset of the extreme
points and rays of $\mathcal{A}_2$. Let us assume that the oracle returns primal and dual extreme
points $(\dot{l}, \dot{x})$ and $(\dot{\theta}, \dot{\psi}, \dot{\pi}_1)$. We know that $\dot{x}$ must be in both $\mathcal{A}_1$ and $\mathcal{A}_2$ and that
$\dot{\theta} g^T + \dot{\psi}^T X \le 0$. If our present collection of extreme points and rays in $X$ is sufficient
to obtain a solution to (1.2), there can be no $x \in \mathcal{A}_2$ such that $\dot{\theta} + \dot{\psi}^T x > 0$. This
can be determined by solving the "Slave" problem of the D-W decomposition scheme
with $(\psi, \delta, \theta) = (\dot{\psi}, 1, \dot{\theta})$, defined as follows:

$$
\begin{array}{ll}
\min\limits_{x \ge 0,\ w} & \delta w = z_2 \\
\text{s.t.} & \nu : \; \psi^T x + \delta w \ge -\theta \\
 & \pi_2 : \; A_2 x \ge b_2.
\end{array}
\qquad (1.5)
$$

The  motivation  for  this  problem  is  to  answer  the  question: 

Is there a point $x \in \mathcal{A}_2$ such that $\theta + \psi^T x > 0$?

For a dual feasible solution to (1.5) with $\delta = 1$, $\nu$ will equal 1, meaning that its corresponding
constraint is binding. In that case, $w = -\psi^T x - \theta$ and we are minimizing $w$ over all
$x \in \mathcal{A}_2$. Therefore, if $z_2 \ge 0$, there can be no $x \in \mathcal{A}_2$ such that $\theta + \psi^T x > 0$, and in
answer to the above question: there is no such point. Further, there are no extreme
points or rays of $\mathcal{A}_2$ which if added to our present collection in $X$ could improve the
overall solution. We have a solution to (1.1).

In (1.5), $(\psi, \delta, \theta)$ is an oracle-provided extreme point or extreme ray of the dual
feasible region of (1.4) for some $(X, g)$. If it is a dual extreme point, $(\psi, \delta, \theta) = (\dot{\psi}, 1, \dot{\theta})$,
and if it is a dual extreme ray, $(\psi, \delta, \theta) = (\vec{\psi}, 0, \vec{\theta})$.

When $\delta$ equals zero in the extreme ray case, (1.5) will have a vacuous objective
and becomes a feasibility problem. We need only find a feasible point to solve it. The
next section details the D-W method of solution, in which we will see that having
$\delta = 0$ directs the Slave to find points in $\mathcal{A}_2$ that could make an infeasible Master
feasible.

Note in the Master that if there are no extreme points among the columns of $X$
then $g = 0$ and (1.4) is infeasible via $0^T l = 1$. To ensure that $\mathcal{A}_2$ contains at least
one extreme point we make $x$ non-negative in $\mathcal{A}_2$ rather than in $\mathcal{A}_1$.
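The role of the indicator vector $g$ can be made concrete with a short sketch. It assembles the added data $(X, g)$ of (1.4) from collected extreme points and rays and tests whether the convexity row $g^T l = 1$, $l \ge 0$ can be satisfied at all. The function names and the sample vectors are hypothetical; $X$ is stored column-wise as a list of vectors purely for convenience.

```python
def build_master_data(points, rays):
    """Assemble the columns of X and the indicator vector g.

    Extreme points receive g_j = 1 and extreme rays receive g_j = 0.
    """
    columns = [tuple(p) for p in points] + [tuple(r) for r in rays]
    g = [1] * len(points) + [0] * len(rays)
    return columns, g


def convexity_row_satisfiable(g):
    """g^T l = 1 with l >= 0 has a solution exactly when some g_j equals 1."""
    return any(gj == 1 for gj in g)


# Only rays collected so far: the convexity row reads 0 = 1.
X_rays_only, g_rays_only = build_master_data(points=[], rays=[(1.0, 0.0)])

# After one extreme point has been passed up, the row can be satisfied.
X_mixed, g_mixed = build_master_data(points=[(0.0, 1.0)], rays=[(1.0, 0.0)])
```

In the first case the Master is infeasible regardless of the remaining constraints, which is why the procedure must first collect at least one extreme point of $\mathcal{A}_2$.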

Equation (1.4) is commonly referred to as the Master problem because its dual
solutions $\psi$ impact the objective of (1.5), the Slave, to select extreme information from
the set $\mathcal{A}_2$ that will lead to an overall optimum solution of (1.2). In the following
chapters, the Master/Slave distinction is not sufficient when the dual of this algorithm
is incorporated. For this reason, all such LPs will be referred to as subproblems (being
subordinate to the original problem), and further, (1.4) will be referred to as the top
subproblem, and (1.5) as the bottom subproblem.

1.3.2  The Dantzig-Wolfe Method

In the Dantzig-Wolfe method, the top subproblem (1.4) is solved with $\dot{\mathcal{P}}_2$ and $\vec{\mathcal{P}}_2$
restricted to promising subsets of the extreme points and extreme rays of $\mathcal{A}_2$. Initially
those subsets are empty and we need to build them up to the point where they are
sufficient for determining the solution to the original problem (1.2). On each major
iteration between solving (1.4) and (1.5), one of the sets is expanded: $\dot{\mathcal{P}}_2$ if (1.5) is
optimal, or $\vec{\mathcal{P}}_2$ if (1.5) is unbounded. In (1.4) the values of $x$ are restricted to be
convex combinations of the points in $\dot{\mathcal{P}}_2$ and non-negative linear combinations of the
rays in $\vec{\mathcal{P}}_2$.

In the spirit of Theorem 1.1, the constraints associated with the dual variables $\theta$
and $\psi$ in (1.4), along with $l \ge 0$, form a partial representation of $\mathcal{A}_2$. Consistent with
our earlier definitions, we define this set as

$$\hat{\mathcal{A}}_2 = \{x : x = Xl,\ g^T l = 1,\ l \ge 0\}$$

and so (1.4) becomes

$$\text{minimize } c^T x, \quad x \in \mathcal{A}_1 \cap \hat{\mathcal{A}}_2.$$
We pictorially represent the D-W decomposition of (1.2) in Figure 1.3. On the
left is an anatomic representation of the row index partition, and on the right is a
2-dimensional geometric representation of the intersection of the two polyhedral sets
$\mathcal{A}_1$ and $\mathcal{A}_2$.


Figure  1.3:  Partitioning  constraints  and  set  intersection. 

We graphically and geometrically represent the D-W algorithm with its top and
bottom subproblems (1.4) and (1.5) in Figure 1.4. On the left, the two circles represent
the subproblems, and the arrows, or arcs, represent channels of communication
for their solution information. This diagram will be referred to as a communication
network, on which the decomposition algorithm bases its protocol for passing messages.
The arcs in the diagram are of two types: up and down. Up arcs always pass
primal solutions that are collected at the destinations and used to form partial
representations of the primal feasible regions of the sources. A down arc always passes
dual solutions, of which only the most recent is retained at the destination and used
to modify the objective function of that subproblem.
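The message-passing rule for the two arc types can be sketched as a small data structure. This is a hedged illustration, not the thesis code: the class `Subproblem`, the function `send`, the node names, and the string messages are all hypothetical. The only behavior taken from the text is that up arcs accumulate primal solutions at the destination, while a down arc keeps only the most recent dual solution.

```python
class Subproblem:
    """A node of a communication network (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.primal_received = []   # every primal solution that arrived on an up arc
        self.dual_received = None   # only the latest dual solution from a down arc

    def receive(self, arc_type, message):
        if arc_type == "up":
            self.primal_received.append(message)   # collected, never discarded
        elif arc_type == "down":
            self.dual_received = message           # overwrites the previous one
        else:
            raise ValueError("unknown arc type: " + arc_type)


def send(nodes, arcs, source, message):
    """Deliver `message` along every arc leaving `source`."""
    for src, dst, arc_type in arcs:
        if src == source:
            nodes[dst].receive(arc_type, message)


# Two-node network of the D-W scheme: bottom -> top is up, top -> bottom is down.
nodes = {"top": Subproblem("top"), "bottom": Subproblem("bottom")}
arcs = [("bottom", "top", "up"), ("top", "bottom", "down")]

send(nodes, arcs, "bottom", "extreme point 1")
send(nodes, arcs, "bottom", "extreme ray 1")
send(nodes, arcs, "top", "duals, iteration 1")
send(nodes, arcs, "top", "duals, iteration 2")
```

After these four messages the top node holds both primal solutions, while the bottom node holds only the second set of duals.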


Figure  1.4:  Subproblem  communication  and  partial  representation  of  A2. 
On the right of Figure 1.4 are 2-dimensional geometric representations of the
feasible regions of (1.4) and (1.5). The two dots in the corners of the region $\mathcal{A}_2$ are
the totality of its extreme points, some of which are passed to (1.4). The partial
representation $\hat{\mathcal{A}}_2$ in the drawing above $\mathcal{A}_2$ is based on any combination of extreme points and rays
passed. One of the rays is drawn twice to show its dependence on the extreme points
passed. The six dots in the intersection of $\mathcal{A}_1$ and $\hat{\mathcal{A}}_2$ are the possible extreme point
solutions to (1.4), depending on which combination of the extreme points and rays of
$\mathcal{A}_2$ were used to construct $\hat{\mathcal{A}}_2$.

Given that (1.4) is initially infeasible, the first order of business is to find a point
in $\mathcal{A}_1 \cap \mathcal{A}_2$ in order to demonstrate the feasibility of (1.2), then to find a feasible point
that minimizes the objective function. The next theorem describes an algorithm that
accomplishes these tasks. The set $\hat{\mathcal{A}}_2 = \{x \mid x = v + r,\ v \in \operatorname{conv}(\mathcal{P}_2) \text{ and } r \in \operatorname{cone}(\mathcal{D}_2)\}$
begins empty and is augmented in each cycle between Steps 2 and 3. It in turn
defines the added data $g$ and $X$ in the formulation of (1.4).
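
As a minimal sketch of how the passed extreme points and rays could define the data $g$ and $X$, the following Python fragment stores each passed vector as a column of $X$ and sets the matching entry of $g$ to 1 for a point and 0 for a ray. The function names `partial_representation` and `combine` are hypothetical, and the dense list-of-lists storage is an assumption made only for illustration.

```python
def partial_representation(points, rays):
    """Build (g, X) from the extreme points and rays passed so far.

    Columns of X are the points followed by the rays; g[j] is 1 for a
    point and 0 for a ray, so that g^T l = 1 is a convexity constraint
    on the point weights only.
    """
    columns = [tuple(p) for p in points] + [tuple(r) for r in rays]
    g = [1] * len(points) + [0] * len(rays)
    dim = len(columns[0]) if columns else 0
    # X is stored row-major: X[i][j] is coordinate i of column j.
    X = [[col[i] for col in columns] for i in range(dim)]
    return g, X


def combine(g, X, weights):
    """Return x = X l when l >= 0 and g^T l = 1, otherwise None."""
    if any(w < 0 for w in weights):
        return None
    if abs(sum(gj * w for gj, w in zip(g, weights)) - 1.0) > 1e-12:
        return None
    return [sum(xij * w for xij, w in zip(row, weights)) for row in X]


# Two extreme points and one ray in the plane (made-up numbers).
g, X = partial_representation([(0, 0), (2, 0)], [(0, 1)])
x = combine(g, X, [0.5, 0.5, 3])
```

With no points passed, no weight vector can satisfy $g^T l = 1$, which matches the remark that the set begins empty.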

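Theorem 1.9 (Dantzig-Wolfe Method) This procedure performs $\mathcal{O}(1.2)$:

1. Let $\mathcal{P}_2 = \mathcal{D}_2 = \emptyset$.

2. Consult $\mathcal{O}(1.4)$ and if it returns

• an optimal dual extreme point, let $(\psi, \delta, \theta) \leftarrow (\psi, 1, \theta)$;

• a primal extreme ray, STOP: $\mathcal{O}(1.2)$ is $x$;

• a dual extreme ray, let $(\psi, \delta, \theta) \leftarrow (\psi, 0, \theta)$.

3. Consult $\mathcal{O}(1.5)$ and if it returns

• an optimal primal extreme point, and

i) $z_2 < 0$, let $\mathcal{P}_2 \leftarrow \mathcal{P}_2 \cup \{x\}$ and go to Step 2;

ii) $z_2 \ge 0$, STOP: if $\delta = 1$, $\mathcal{O}(1.2)$ is $x$ from $\mathcal{O}(1.4)$ and $(\pi_1\ \pi_2)$, else
$\mathcal{O}(1.2)$ is $(\pi_1\ \pi_2)$;

• a primal extreme ray, let $\mathcal{D}_2 \leftarrow \mathcal{D}_2 \cup \{x\}$ and go to Step 2;

• a dual extreme ray, STOP: $\mathcal{O}(1.2)$ is $(0\ \pi_2)$.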

Proof: We will work from four cases and then show finiteness.

Case 1: If $\mathcal{A}_1 = \emptyset$ and $\mathcal{A}_2 \ne \emptyset$, $\mathcal{O}(1.4)$ will finish infeasible, implying
$\pi_1^T(b_1 - A_1 x) > 0\ \forall x \in \mathcal{A}_2$. Then, $\mathcal{O}(1.5)$ will return optimal extreme points with
$z_2 > 0$. $\mathcal{O}(1.2)$ returns $(\pi_1\ \pi_2)$.

Case 2: If $\mathcal{A}_1 \ne \emptyset$ and $\mathcal{A}_2 = \emptyset$, $\mathcal{O}(1.5)$ will return a dual ray. $\mathcal{O}(1.2)$ returns
$(0\ \pi_2)$.

Case 3: If $\mathcal{A}_2 \ne \emptyset$ and $\mathcal{A}_1 \ne \emptyset$ and in Step 2, $\mathcal{O}(1.4)$ is a dual ray, then
by Theorem 1.5, the hyperplane $\{x : \theta + \psi x = 0\}$ strictly separates $\mathcal{A}_1$ and $\hat{\mathcal{A}}_2$.
According to Step 3 and Equation (1.5) we must find a point as far as possible on the
opposite side of this hyperplane from $\hat{\mathcal{A}}_2$ that also lies in $\mathcal{A}_2$. If no such point exists,
$z_2 = 0$ and the original problem (1.2) must be infeasible; $\mathcal{O}(1.2)$ returns $(\pi_1\ \pi_2)$.
On the other hand, if one does exist, go back to Step 2.

Case 4: If $\mathcal{A}_2 \ne \emptyset$ and $\mathcal{A}_1 \ne \emptyset$ and in Step 2, $\mathcal{O}(1.4)$ is a dual point, then by
Theorem 1.2, the hyperplane $\{x : \theta + \psi x = 0\}$ separates $\hat{\mathcal{A}}_2$ from all points $x \in \mathcal{A}_2$
that could give a better value of $c^T x$. According to Step 3 and Equation (1.5) we must
find a point as far as possible on the opposite side of this hyperplane from $\hat{\mathcal{A}}_2$ that
also lies in $\mathcal{A}_2$. If no such point exists, the original problem (1.2) must be optimal,
and $\mathcal{O}(1.2)$ returns $x$, $(\pi_1\ \pi_2)$, where $x$ is the present primal optimal solution to
(1.4). On the other hand, if one does exist, go back to Step 2.

Finiteness: Step 3 can never pass the same information twice because any dual
solution from (1.4) satisfies $\theta + \psi x \ge 0$ for all $x \in \hat{\mathcal{A}}_2$, and the procedure would
stop either optimal or infeasible. The procedure is finite because it is drawing upon a
finite set of extreme-point and extreme-ray data of $\mathcal{A}_2$ that can be passed from (1.5)
to (1.4). At any point in the algorithm some subset of this information is in the top
subproblem. Each such subset must be different because the top's feasible region is
expanded in each cycle. Since the number of subsets is finite and none is repeated,
the algorithm must eventually stop. ∎
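
The control flow of the procedure in Theorem 1.9 can be sketched in Python as follows. This is only an outline under stated assumptions: the two oracles are supplied by the caller as plain functions, the result labels (`"dual_point"`, `"primal_ray"`, and so on), the dictionary keys, and the status strings `"optimal"`, `"infeasible"`, `"unbounded"` are hypothetical names that summarize the STOP cases, and no linear program is actually solved here.

```python
def dantzig_wolfe(top_oracle, bottom_oracle, max_cycles=1000):
    """Cycle between the top and bottom oracles until a STOP case occurs.

    top_oracle(points, rays) returns (kind, solution) with kind in
    {"dual_point", "dual_ray", "primal_ray"}.
    bottom_oracle(psi, delta, theta) returns (kind, solution) with kind in
    {"primal_point", "primal_ray", "dual_ray"}.
    """
    points, rays = [], []  # Step 1: both sets start empty
    for _ in range(max_cycles):
        # Step 2: consult the top subproblem.
        kind, top = top_oracle(points, rays)
        if kind == "primal_ray":
            return "unbounded", top, None
        if kind not in ("dual_point", "dual_ray"):
            raise ValueError(f"unexpected top result: {kind}")
        delta = 1 if kind == "dual_point" else 0
        # Step 3: consult the bottom subproblem with the latest dual data.
        kind, bottom = bottom_oracle(top["psi"], delta, top["theta"])
        if kind == "dual_ray":
            return "infeasible", top, bottom
        if kind == "primal_ray":
            rays.append(bottom["x"])
            continue
        if kind != "primal_point":
            raise ValueError(f"unexpected bottom result: {kind}")
        if bottom["z2"] < 0:
            points.append(bottom["x"])
            continue
        # z2 >= 0: optimal when the top returned a dual point, else infeasible.
        return ("optimal" if delta == 1 else "infeasible"), top, bottom
    raise RuntimeError("cycle limit reached")


# Scripted stand-in oracles (made-up data) that exercise one full cycle.
calls = []

def scripted_top(points, rays):
    calls.append(len(points))
    if not points:
        return "dual_ray", {"psi": [1.0], "theta": 0.0}
    return "dual_point", {"psi": [1.0], "theta": -1.0, "x": [2.0]}

def scripted_bottom(psi, delta, theta):
    z2 = -1.0 if delta == 0 else 0.0
    return "primal_point", {"x": [2.0], "z2": z2}

status, top_solution, bottom_solution = dantzig_wolfe(scripted_top, scripted_bottom)
```

The cycle limit is a safeguard added for the sketch; the finiteness argument in the proof is what guarantees termination for exact oracles.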

1.3.3 The D-W Communication Network

The  goal  of  this  chapter  and  the  next  is  to  characterize  the  space  of  all  decomposition 
schemes  by  deriving  the  subproblem  formulations  in  increasingly  complex  steps.  At 
each  step,  the  system  of  subproblems  will  be  transformed  into  an  equivalent  symbolic 
representation,  and  a  symbolic  operation  will  be  defined  to  mimic  the  decomposition 
step. 

To introduce symbolic decomposition and to develop a formal representation of
D-W decomposition, let us formalize the concept of a communication network. It
is a symbolic representation of a decomposition scheme containing the information
necessary to define all of the subproblems, symbolized by nodes, and their interactions,
symbolized by arcs.

Let $\mathcal{G}^N$ be the set of all communication networks on $N$ nodes. These networks are
alternative representations of LP decomposition schemes. We will define a transformation
from the space of all decomposition schemes to all communication networks.
It will be shown that this transformation is reversible.

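Definition 1.10 (Communication Network) A communication network is a five-tuple.
For example,

\[ g = (\mathcal{N}, \mathcal{R}_n, \mathcal{C}_n, \mathcal{A}, T_a), \quad \forall n \in \mathcal{N},\ a \in \mathcal{A}, \]

where the elements of the tuple are defined to be

$\mathcal{N}$  set of nodes, $\mathcal{N} \ne \emptyset$,

$\mathcal{R}_n$  node $n$'s row index set,

$\mathcal{C}_n$  node $n$'s column index set,

$\mathcal{A}$  set of arcs, $\mathcal{A} = \{(n_1, n_2) \in \mathcal{N}^2 : \text{there is an arc from node } n_1 \text{ to } n_2\}$, and

$T_a$  the type for arc $a$ (up, down, left, or right), where if $\mathcal{A} = \emptyset$ there are no $T_a$.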
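
One possible in-memory form of this five-tuple is sketched below in Python. The class name `Network`, its field names, and the use of strings such as `"pi"` and `"x"` as stand-ins for whole index sets are assumptions made for illustration; the checks simply restate the conditions of the definition.

```python
from dataclasses import dataclass, field

ARC_TYPES = {"up", "down", "left", "right"}


@dataclass
class Network:
    nodes: frozenset                           # the node set, must be nonempty
    rows: dict                                 # node -> frozenset of row indices
    cols: dict                                 # node -> frozenset of column indices
    arcs: dict = field(default_factory=dict)   # (source, destination) -> arc type

    def __post_init__(self):
        if not self.nodes:
            raise ValueError("the node set must be nonempty")
        if set(self.rows) != set(self.nodes) or set(self.cols) != set(self.nodes):
            raise ValueError("every node needs a row and a column index set")
        for (src, dst), kind in self.arcs.items():
            if src not in self.nodes or dst not in self.nodes:
                raise ValueError("arc endpoints must be nodes")
            if kind not in ARC_TYPES:
                raise ValueError(f"unknown arc type: {kind}")


# A network with a single node and no arcs (an undecomposed problem).
single = Network(frozenset({0}), {0: frozenset({"pi"})}, {0: frozenset({"x"})})
```

Storing the arc types in a dictionary keyed by the arc keeps the last two elements of the tuple together, so an empty arc set automatically has no types.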

The arc types left and right are used in the Benders decomposition method and
explained later. They are included here for completeness of the definition.

The simplest network is one with only one node and no arcs. It symbolizes a
linear program that has not been decomposed. We offer $g_1$ as an example:

\[ g_1 = (\{0\}, \pi, x, \emptyset). \]

The node is numbered zero. The sets of row and column indices are respectively $\pi$
and $x$. The set of arcs is empty, and the arc types are not applicable. The network $g_1$
symbolizes everything in the formulation of the problem (1.1) except the actual values
of the data $(A, b, c)$. A communication network, together with the data $(A, b, c)$, is
sufficient information to obtain a solution to (1.1).

Subproblem index sets are used in the definition of the forward transformation,
which was first depicted in the shaded region of Figure 1.2. An overbar distinguishes
them from the node index sets of communication networks. Typically, $\bar{\mathcal{R}}_n \supseteq \mathcal{R}_n$ and
$\bar{\mathcal{C}}_n \supseteq \mathcal{C}_n$.

Definition 1.11 (Subproblem Index Sets) Let the set $\bar{\mathcal{R}}_n$ contain all row indices
of the linear program subproblem associated with node $n$, and let $\bar{\mathcal{C}}_n$ contain all column
indices of the linear program subproblem associated with node $n$.

Thus in the example above with one node and no arcs, $\bar{\mathcal{R}}_0 = \pi$, i.e., all rows, and
$\bar{\mathcal{C}}_0 = x$, i.e., all columns.

We now define a communication network based on the Dantzig-Wolfe (top and
bottom) subproblems. This operation was referred to when explaining Figure 1.2.
We are taking the initial step from linear programs to communication networks.

Definition 1.12 (Forward Transform) The forward transform from a D-W decomposition
scheme to the communication network $g_D$ is a five-step process:

1. Define the set $\mathcal{N}$ having one node for each subproblem.

2. For each node $n \in \mathcal{N}$, define the elements of the row index set $\mathcal{R}_n = \bar{\mathcal{R}}_n \cap \mathcal{R}$.

3. For each node $n \in \mathcal{N}$, define the elements of the column index set $\mathcal{C}_n = \bar{\mathcal{C}}_n \cap \mathcal{C}$.



4. For each subproblem $n_1 \in \mathcal{N}$ containing constraints indexed by $\theta$ and $\psi$ that
receive information from another subproblem $n_2 \in \mathcal{N}$, define an arc $(n_2 n_1) \in \mathcal{A}$
of type $T_{n_2 n_1} = $ up.

5. For each subproblem $n_1 \in \mathcal{N}$ containing constraints indexed by $\nu$ that receive
information from another subproblem $n_2 \in \mathcal{N}$, define an arc $(n_2 n_1) \in \mathcal{A}$ of type
$T_{n_2 n_1} = $ down.

With these transformation rules, we can derive the network in Figure 1.4 corresponding
to the two Dantzig-Wolfe subproblems (1.4) and (1.5).

1. $\mathcal{N} = \{1, 2\}$ since there are two subproblems, with 1 corresponding to the top
and 2 corresponding to the bottom.

2. $\mathcal{R}_1 = \pi_1$ and $\mathcal{R}_2 = \pi_2$, signifying that the rows are partitioned so that those
associated with $\pi_1$ go to the top subproblem and those associated with $\pi_2$ go
to the bottom subproblem.

3. $\mathcal{C}_1 = \mathcal{C}_2 = x$ as the columns were not partitioned (that comes later).

4. $\theta$ and $\psi$ appear in (1.4) so let $\mathcal{A} \leftarrow \mathcal{A} \cup \{(21)\}$ and $T_{21} = $ up.

5. $\nu$ appears in (1.5) so let $\mathcal{A} \leftarrow \mathcal{A} \cup \{(12)\}$ and $T_{12} = $ down.
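
The five steps above can be sketched in Python. The input format is an assumption: each subproblem is described by its full (overbar) row and column index sets plus a map `receives` from the name of an interface dual to the subproblem that supplies it. The function name `forward_transform` and the string labels (`"pi1"`, `"theta"`, `"l"`, `"w"`, and so on) are likewise hypothetical.

```python
def forward_transform(subproblems, all_rows, all_cols):
    """Map D-W subproblem descriptions to (nodes, rows, cols, arcs)."""
    nodes = frozenset(subproblems)                          # step 1
    rows = {n: frozenset(sp["rows"]) & all_rows             # step 2
            for n, sp in subproblems.items()}
    cols = {n: frozenset(sp["cols"]) & all_cols             # step 3
            for n, sp in subproblems.items()}
    arcs = {}
    for n1, sp in subproblems.items():
        for name, n2 in sp["receives"].items():
            if name in ("theta", "psi"):                    # step 4
                arcs[(n2, n1)] = "up"
            elif name == "nu":                              # step 5
                arcs[(n2, n1)] = "down"
    return nodes, rows, cols, arcs


# Top subproblem 1 holds rows pi1 plus the interface rows theta and psi;
# bottom subproblem 2 holds rows pi2 plus the interface row nu.
subproblems = {
    1: {"rows": {"pi1", "theta", "psi"}, "cols": {"x", "l"},
        "receives": {"theta": 2, "psi": 2}},
    2: {"rows": {"pi2", "nu"}, "cols": {"x", "w"},
        "receives": {"nu": 1}},
}
nodes, rows, cols, arcs = forward_transform(subproblems, {"pi1", "pi2"}, {"x"})
```

Intersecting with the original index sets is what strips the interface rows and columns, leaving only the partition of the original problem on each node.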

In summary,

\[ g_D = (\{1, 2\}, \pi_1, \pi_2, x, x, \{(12), (21)\}, \text{down}, \text{up}) \in \mathcal{G}^2. \]

In this notation the set of nodes appears first, followed by a list of row index sets,
one for each node, followed by a corresponding list of column index sets. Next is the
set of arcs followed by a list of arc types.

Theorem 1.13 (The Reverse Transform) The reverse transform from $g_D$ back
to a D-W decomposition scheme proceeds as follows:

1. For each node $n$, create a subproblem beginning with the original constraints of
the form

\[ A_{i\cdot}\, x \ge b_i, \quad \forall i \in \mathcal{R}_n. \]

2. Corresponding to the up arc from 2 to 1, include constraints in the top subproblem
of the form

\[ g^T l = 1, \quad Xl = x, \quad l \ge 0, \]

where $g$ and $X$ are information transported by the up arc.

3. Corresponding to the down arc from 1 to 2, include constraints in the bottom
subproblem of the form

\[ \delta w \ge -\theta - \psi x, \]

where $(\theta, \delta, \psi)$ is information transported by the down arc. In addition, place
the term $+\delta w$ in the objective row.

4. Place the term $+c^T x$ in the objective row of the top subproblem.

5. Place the non-negativity constraints $x \ge 0$ in the bottom subproblem.

Proof: Proof by example (WLOG). ∎

We use the node index set $\mathcal{R}_0$ in the following definition.

Definition 1.14 (Dantzig-Wolfe Operator, ⊟) The Dantzig-Wolfe operator ⊟
maps $\mathcal{G}^1$ into $\mathcal{G}^2$ using a partition $[P_1, P_2]$ of $\mathcal{R}_0$:

\[ \mathcal{G}^1 ⊟ [P_1, P_2] \rightarrow \mathcal{G}^2. \]

In words this means:

Apply Dantzig-Wolfe decomposition to the linear program associated with
node $n$ in the communication network. Partition the constraints so that
those corresponding to the dual variables in $P_1$ are in the top (D-W Master)
subproblem and those corresponding to the dual variables in $P_2$ are in
the bottom (D-W Slave) subproblem. The algorithm described in Theorem 1.9
is implicitly associated with the resultant communication network.

This is our most elementary row partition. Begin with the network $g_1$, which has
only one node, and map it into a two-node network corresponding to the left side of
Figure 1.4. Thus, $g_1 = (\{0\}, \pi_1 \cup \pi_2, x, \emptyset)$ and

\[ g_D = g_1 ⊟ [\pi_1, \pi_2], \]

where $g_D$ was given above. From $g_D$ we can determine which node is the top or
bottom by the arcs that link them. The up arc must be destined for the top node
(subproblem).
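
As an illustration, the operator can be mimicked on a tuple representation of the network. The sketch below is restricted to the one-node case described in the text; the tuple layout `(nodes, rows, cols, arcs)` and the name `dw_operator` are assumptions, and the string labels stand in for whole index sets.

```python
def dw_operator(network, p1, p2):
    """Split the single node 0 by rows into a top node 1 and a bottom node 2."""
    nodes, rows, cols, arcs = network
    if nodes != frozenset({0}) or arcs:
        raise ValueError("this sketch only splits the one-node network")
    p1, p2 = frozenset(p1), frozenset(p2)
    if p1 & p2 or (p1 | p2) != rows[0]:
        raise ValueError("[P1, P2] must partition the row index set of node 0")
    return (
        frozenset({1, 2}),
        {1: p1, 2: p2},                     # rows are partitioned
        {1: cols[0], 2: cols[0]},           # columns are shared by both nodes
        {(1, 2): "down", (2, 1): "up"},     # duals go down, primals go up
    )


g1 = (frozenset({0}), {0: frozenset({"pi1", "pi2"})}, {0: frozenset({"x"})}, {})
gD = dw_operator(g1, {"pi1"}, {"pi2"})

# The top node is the destination of the up arc.
top = next(dst for (src, dst), kind in gD[3].items() if kind == "up")
```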

The next definition is needed to establish an equivalence between subproblem
schemes and communication networks. We define the Inverse Operator on only those
collections of nodes that, once collapsed, can be re-split to regain the original network.
Later we will rely on this reversibility property of the inverse operator when
characterizing the space of all communication networks, $\mathcal{G}^N$.

Definition 1.15 (Inverse Operator, □) The inverse operator □ takes a set of
nodes and collapses it back into a single node. It is defined such that

\[ g_1 = (g_1 ⊟ [P_1, P_2]) \,□\, \{1, 2\}, \]

for all partitions $[P_1, P_2]$ of $\mathcal{R}_0$.
In the expression $g_1 = g_D \,□\, \{1, 2\}$, two nodes are collapsed into one, and the arcs are
discarded.
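
A minimal sketch of the collapse for the two-node case follows, assuming the same tuple layout `(nodes, rows, cols, arcs)` with string labels standing in for index sets. The name `inverse_operator` is hypothetical, and the sketch deliberately handles only the case where the whole network is collapsed.

```python
def inverse_operator(network, group):
    """Collapse all nodes of the network into a single node numbered 0."""
    nodes, rows, cols, arcs = network
    if frozenset(group) != nodes:
        raise ValueError("this sketch only collapses the whole network")
    merged_rows = frozenset().union(*(rows[n] for n in nodes))
    merged_cols = frozenset().union(*(cols[n] for n in nodes))
    # The arcs between the collapsed nodes are discarded.
    return (frozenset({0}), {0: merged_rows}, {0: merged_cols}, {})


gD = (
    frozenset({1, 2}),
    {1: frozenset({"pi1"}), 2: frozenset({"pi2"})},
    {1: frozenset({"x"}), 2: frozenset({"x"})},
    {(1, 2): "down", (2, 1): "up"},
)
g1 = inverse_operator(gD, {1, 2})
```

Taking unions of both the row and the column index sets means the same routine also collapses a column-partitioned pair of nodes.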

For networks with more than two nodes, the inverse operator can be thought of
as identifying implicit subproblems, i.e., collections of subproblems that imitate, in
concert, a subproblem that does not exist explicitly. This is explained more carefully
in Chapter Two.

The implication of the triple (the transform, the D-W operator, and the inverse
operator) is that we have created the symbolic space $\mathcal{G}^2$ within which we can mimic
the algebraic manipulations of decomposition. By partitioning the index set of a
node in a specific way, we mimic the creation of two subproblems from a single LP.
Alternatively, the original LP can be regained by combining the index sets of the two
nodes, also in a specific way.


1.4  Benders  Decomposition 

The purpose of this section is to derive an oracle for LPs that have been sliced
vertically (partitioned by columns) by dualizing the concepts we have discussed for
those sliced horizontally (partitioned by rows). The theorems and definitions of the
previous section will thus return in their dual forms. To characterize $\mathcal{G}^1$ and $\mathcal{G}^2$
we first complete the forward transform by including vertical slicing and deriving its
companion oracle. Then we define in sequence: the dual operator, its inverse, and
the dual network.


1.4.1  The  Subproblems 

Now consider partitioning the LP (1.1) where $A = (A_1\ A_2)$ and $c^T = (c_1^T\ c_2^T)$, and
$g_1 = (\{0\}, \pi, x_1 \cup x_2, \emptyset)$. The index sets $x_1$ and $x_2$ form a partition of the column
index set $x$ of (1.1) and $\pi$ is its row index set. With this column partition the LP
problem becomes

\[
\begin{array}{ll}
\min\limits_{x \ge 0} & c_1^T x_1 + c_2^T x_2 = z \\
\text{s.t.} & \pi:\ A_1 x_1 + A_2 x_2 \ge b.
\end{array}
\tag{1.6}
\]

The subproblems and the oracle for Benders decomposition [Ben62] can be derived
directly from D-W decomposition by replacing primal/dual steps by corresponding
dual/primal steps. We want the decomposition-style oracle $\mathcal{O}(1.6)$ that utilizes $\mathcal{O}(1.7)$
and $\mathcal{O}(1.8)$. A vertical slice through $A$ between $x_1$ and $x_2$ partitions its column index
set $\mathcal{C}$. To derive the Benders subproblems, we take the dual of (1.6), apply the D-W
operator, and then take the duals once again of the resulting subproblems.

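Taking the dual of (1.6) gives

\[
\begin{array}{ll}
\max\limits_{\pi \ge 0} & b^T \pi = z \\
\text{s.t.} & x_1:\ A_1^T \pi \le c_1 \\
 & x_2:\ A_2^T \pi \le c_2,
\end{array}
\tag{1.6$^D$}
\]

where we have introduced the notation $(\cdot)^D$ to indicate the dual of the LP argument.

Applying D-W decomposition to this problem as was done for (1.2) gives us the
two subproblem formulations (1.7)$^D$ and (1.8)$^D$. We have substituted the variables
$(t, y, \lambda, u, \omega)$ for $(\theta, \psi, l, \nu, w)$, and the data $(\gamma, \Omega, y, d, t)$ for $(g, X, \psi, \delta, \theta)$. They
are corresponding Greek and Roman characters. Specifically,

\[
\begin{array}{ll}
\max\limits_{\pi,\ \lambda \ge 0} & b^T \pi = z_1 \\
\text{s.t.} & t:\ \gamma^T \lambda = 1 \\
 & y:\ \Omega \lambda - I \pi = 0 \\
 & x_1:\ A_1^T \pi \le c_1,
\end{array}
\tag{1.7$^D$}
\]

and

\[
\begin{array}{ll}
\max\limits_{\pi \ge 0,\ \omega} & d\omega = z_2 \\
\text{s.t.} & u:\ y^T \pi + d\omega \le -t \\
 & x_2:\ A_2^T \pi \le c_2.
\end{array}
\tag{1.8$^D$}
\]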

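Define $\mathcal{B}_1 = \{\pi : A_1^T \pi \le c_1\}$ and $\mathcal{B}_2 = \{\pi : A_2^T \pi \le c_2,\ \pi \ge 0\}$. Note that if $\mathcal{B}$ is the
feasible region of (1.6)$^D$ then $\mathcal{B} = \mathcal{B}_1 \cap \mathcal{B}_2$, and the columns of the matrix $\Omega$ contain
extreme points and extreme rays of $\mathcal{B}_2$. We define $\mathcal{P}_2$ and $\mathcal{D}_2$ to be the respective
subsets of the extreme points and rays of $\mathcal{B}_2$ and get $\Omega_{\cdot j} \in \mathcal{D}_2 \cup \mathcal{P}_2$. Since we define
$\gamma_j$ to be 1 if $\Omega_{\cdot j} \in \mathcal{P}_2$ and 0 otherwise, the $\pi$ in (1.7)$^D$ must be an element of $\mathcal{B}$ if $\mathcal{P}_2$
and $\mathcal{D}_2$ contain all of the extreme points and extreme rays in $\mathcal{B}_2$.

The respective duals of the previous two maximization problems give us the standard
subproblems of Benders decomposition:

\[
\begin{array}{ll}
\min\limits_{x_1 \ge 0,\ y,\ t} & c_1^T x_1 + t = z_1 \\
\text{s.t.} & \pi:\ A_1 x_1 - I y = b \\
 & \lambda:\ \Omega^T y + \gamma t \ge 0,
\end{array}
\tag{1.7}
\]

and

\[
\begin{array}{ll}
\min\limits_{x_2 \ge 0,\ u \ge 0} & c_2^T x_2 - t u = z_2 \\
\text{s.t.} & \omega:\ d u = d \\
 & \pi:\ A_2 x_2 + y u \ge 0.
\end{array}
\tag{1.8}
\]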

1.4.2  The  Benders  Method 

To construct a decomposition procedure that performs $\mathcal{O}(1.6)$, we first make the
following definition.

Definition 1.16 (Dual Oracle) The dual of an oracle, symbolized as $\mathcal{O}^D(\cdot)$, interprets
the dual solutions of $\mathcal{O}(\cdot)$ as primal solutions, and the primal solutions of $\mathcal{O}(\cdot)$
as dual solutions.

As defined, the dual of the dual oracle is the original oracle. We get the following
property by combining the dual oracle with the dual of a linear program.

Property 1.17 (Oracle Dual Symmetry) The oracle $\mathcal{O}(\cdot)$ is dual symmetric in
that

\[ \mathcal{O}^D(\cdot)^D = \mathcal{O}(\cdot). \]

We are reusing our notation $(\cdot)^D$ to indicate the dual of the LP argument. To verify
this property, consult Table 1.1 and note that the simplex method can perform $\mathcal{O}(\cdot)^D$.
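
The relabeling in Definition 1.16 can be sketched as a wrapper around any oracle function. The sketch assumes, purely for illustration, that an oracle returns a dictionary keyed by `"primal"` and `"dual"`; the names `dual_oracle` and `toy_oracle` and the returned numbers are made up. It covers only the definition of the dual oracle, not the dualization of the LP argument that Property 1.17 also involves.

```python
SWAP = {"primal": "dual", "dual": "primal"}


def dual_oracle(oracle):
    """Wrap an oracle so its primal and dual solutions trade labels."""
    def consulted(problem):
        result = oracle(problem)
        return {SWAP[side]: solution for side, solution in result.items()}
    return consulted


def toy_oracle(problem):
    # Stand-in for a real LP solver: fixed, made-up solutions.
    return {"primal": ("point", [1.0, 0.0]), "dual": ("point", [2.0])}


swapped = dual_oracle(toy_oracle)("some LP")
restored = dual_oracle(dual_oracle(toy_oracle))("some LP")
```

Applying the wrapper twice returns the original labeling, which is the sense in which the dual of the dual oracle is the original oracle.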

Corollary 1.18 (Benders Method) The following procedure can perform $\mathcal{O}(1.6)$:

1. Let $\mathcal{P}_2 = \mathcal{D}_2 = \emptyset$.

2. Consult $\mathcal{O}(1.7)$ and when it returns

• an optimal primal extreme point, let $(y, d, t) \leftarrow (y, 1, t)$;

• a dual extreme ray, STOP: $\mathcal{O}(1.6)$ is $\pi$;

• a primal extreme ray, let $(y, d, t) \leftarrow (y, 0, t)$.

3. Consult $\mathcal{O}(1.8)$ and when it returns

• an optimal dual extreme point and

i) $z_2 < 0$, let $\mathcal{P}_2 \leftarrow \mathcal{P}_2 \cup \{\pi\}$, and go to Step 2;

ii) $z_2 \ge 0$, STOP: if $d = 1$, $\mathcal{O}(1.6)$ is $(x_1\ x_2)$ and $\pi$, the latter coming
from $\mathcal{O}(1.7)$, otherwise $\mathcal{O}(1.6)$ is $(x_1\ x_2)$;

• a dual extreme ray, let $\mathcal{D}_2 \leftarrow \mathcal{D}_2 \cup \{\pi\}$, and go to Step 2;

• a primal extreme ray, STOP: $\mathcal{O}(1.6)$ is $(0\ x_2)$.


Proof: Begin with the Dantzig-Wolfe Oracle in Theorem 1.9, and wherever $\mathcal{O}(\cdot)$
is consulted, replace that consultation by $\mathcal{O}^D(\cdot)^D$, using the Oracle Dual Symmetry
Property. Next, for each of the three cases for solutions, interchange the words primal
and dual, and replace the oracle consultations as before, but with $\mathcal{O}(\cdot)^D$, using the
definition of a dual oracle. Finally, replace the subproblem formulations with their
duals, and replace the oracle consultations as before but with $\mathcal{O}(\cdot)$, using the definition
of $(\cdot)^D$. The oracles $\mathcal{O}(1.7)$ and $\mathcal{O}(1.8)$ for the Benders left and right subproblems
then become the results we require to complete $\mathcal{O}(1.6)$. ∎

As a note on the milestones of the procedure above, once having found $\pi \in \mathcal{B}_1 \cap \mathcal{B}_2$,
we have demonstrated primal boundedness for (1.6). To continue, we must
work toward dual optimality in order to show primal optimality. If we find dual
unboundedness then the primal form (1.6) is infeasible.

We will anatomically and graphically represent the Benders decomposition of (1.6)
as Figure 1.5. On the left, the matrix $A$ is sliced vertically between $x_1$ and $x_2$ to
symbolize the partition of the set $x$ into the sets $x_1$ and $x_2$. On the right is the
communication network, on which the algorithm bases its protocol for passing information
and making modifications. The right arc always passes primal solutions, of
which only the most recent is retained at the destination (node 2) and used to modify
the right-hand side of the subproblem. The left arc always passes dual solutions that
are collected at the destination (node 1) and used to form a partial representation of
the dual feasible region of the source node.

Figure  1.5:  Partitioning  constraints  and  communication. 


1.4.3  Dual  Communication  Network  Theory 

The network theory corresponding to the above subproblems and oracle derivations
is itself replete with the use of duality concepts. First, the Forward Transform (from
subproblems to networks) needs two new rules that are dual to Rules 4 and 5 and
serve to transform the Benders (left and right) subproblems. Next, we present the
dual to the D-W network, which is generated by the Benders operator.

Definition 1.19 (Forward Transform continued) The forward transform from a
Benders decomposition scheme is a five-step process. The first three steps are taken
as those in the prior Forward Transform definition, and the last two are additions to
the prior that complete the definition over $\mathcal{G}^1$ and $\mathcal{G}^2$.

6. For each subproblem $n_1 \in \mathcal{N}$ containing variables named $y$ and $t$, and for every
other subproblem $n_2 \in \mathcal{N}$ that provides information for those columns, define
an arc $(n_2 n_1) \in \mathcal{A}$ of type $T_{n_2 n_1} = $ left.

7. For each subproblem $n_1 \in \mathcal{N}$ containing variables named $u$, and for every other
subproblem $n_2 \in \mathcal{N}$ that provides information for those columns, define an arc
$(n_2 n_1) \in \mathcal{A}$ of type $T_{n_2 n_1} = $ right.

Theorem 1.20 (Dual Reverse Transform) The reverse transform back to a Benders
decomposition scheme from $g_B$ proceeds as follows:

1. For each node $n$, create a subproblem beginning with the original columns $x$ of
the form

\[ \begin{pmatrix} c_j \\ A_{\cdot j} \\ 0 \end{pmatrix}, \quad \forall j \in \mathcal{C}_n. \]

2. Corresponding to the left arc from 2 to 1, include columns $t$ and $y$ in the left
subproblem of the form

\[ \begin{pmatrix} 1 \\ 0 \\ \gamma \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ -I \\ \Omega^T \end{pmatrix}, \]

where $\gamma$ and $\Omega$ are information transported by the left arc. The constraints
indexed by $\lambda$ are $\ge$ ones.

3. Corresponding to the right arc from 1 to 2, include columns in the right subproblem
of the form

\[ \begin{pmatrix} -t \\ y \\ d \end{pmatrix}, \]

where $(t, d, y)$ is information transported by the right arc. In addition, place the
term $d\omega$ in the right-hand side.

4. Place $b$ in the right-hand side of the left subproblem.

5. Make the right subproblem's constraints indexed by $\pi$ into $\ge$ ones.

Proof: This theorem is derived from the reverse transform in the same manner that
Benders decomposition was derived from D-W decomposition. ∎

Using the forward and reverse transform on the D-W and Benders subproblems
and networks, and the duality of those subproblems, we can obtain a duality theorem
for the networks. The following definition will help with the mechanics.

Definition 1.21 (Transpose Arcs) a) up transpose is type left, b) down transpose
is type right, c) left transpose is type up, d) right transpose is type down.

Property 1.22 (Arc Duality) The transpose of the transpose of an arc type is the
same type.


At this point we introduce our first duality theorem for communication networks.

Theorem 1.23 (Network Duality) To take the dual of a network $g$ with nodes $\mathcal{N}$
and arcs $\mathcal{A}$, interchange the row and column index sets of each node and transpose
all arc types. We call the result $g^D$ and note that the dual of $g^D$ is $g$. The problem
data becomes $(-A, -b, -c)$ so that minimize switches with maximize, and $\ge$ switches
with $\le$.

Proof: We can work either way through the sequence

\[ g_D \overset{\text{Xform}}{\Longleftrightarrow} \mathcal{O}(1.2) \overset{\text{Dual}}{\Longleftrightarrow} \mathcal{O}(1.6) \overset{\text{Xform}}{\Longleftrightarrow} g_B, \]

which is necessary and sufficient for the short form

\[ g_D \overset{\text{Dual}}{\Longleftrightarrow} g_B. \]

∎
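
The mechanics of Theorem 1.23 can be sketched on a tuple representation `(nodes, rows, cols, arcs)`, using the arc transposes of Definition 1.21. The layout, the name `network_dual`, and the string labels standing in for index sets are assumptions of this sketch. Because the dual interchanges rows and columns, the labels also change roles: what were row labels of the primal network appear as column labels of its dual.

```python
TRANSPOSE = {"up": "left", "down": "right", "left": "up", "right": "down"}


def network_dual(network):
    """Swap each node's row and column index sets and transpose all arc types."""
    nodes, rows, cols, arcs = network
    return (
        nodes,
        dict(cols),   # the dual's rows are the primal's columns
        dict(rows),   # the dual's columns are the primal's rows
        {arc: TRANSPOSE[kind] for arc, kind in arcs.items()},
    )


gD = (
    frozenset({1, 2}),
    {1: frozenset({"pi1"}), 2: frozenset({"pi2"})},
    {1: frozenset({"x"}), 2: frozenset({"x"})},
    {(1, 2): "down", (2, 1): "up"},
)
gD_dual = network_dual(gD)
```

The result has shared rows, partitioned columns, and arc types right and left, which is the shape of a column-partitioned (Benders) network.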


The application of the D-W operator to dual networks as in

\[ g_B = (g_1^D ⊟ [x_1, x_2])^D \]

creates a new operator for our symbolic calculus.

Definition 1.24 (Benders Operator, ◫) The Benders operator ◫ maps networks
in $\mathcal{G}^1$ into those in $\mathcal{G}^2$ using a partition $[P_1, P_2]$ of $\mathcal{C}_0$, a subproblem index set for
node $n$.

In words this means:

Apply Benders decomposition to the linear program corresponding to the
node to be split in the communication network. Partition the variables
so that $x_1$ is in the left (Benders Master) subproblem and $x_2$ is in the
right (Benders Slave) subproblem. The governing algorithm described in
Corollary 1.18 is implicitly associated with the resultant communication
network.

This case also begins with one node and no arcs, $g_1 = (\{0\}, \pi, x_1 \cup x_2, \emptyset)$, and

\[ g_B = g_1 ◫ [x_1, x_2], \]

where $g_B = (\{1, 2\}, \pi, \pi, x_1, x_2, \{(12), (21)\}, \text{right}, \text{left})$.
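
The construction of the Benders operator as "dualize, split by rows, dualize again" can be checked with a short Python sketch. The tuple layout `(nodes, rows, cols, arcs)`, the function names, and the string labels standing in for index sets are assumptions; the row split is restricted to the one-node case.

```python
TRANSPOSE = {"up": "left", "down": "right", "left": "up", "right": "down"}


def network_dual(network):
    """Swap row and column index sets and transpose all arc types."""
    nodes, rows, cols, arcs = network
    return nodes, dict(cols), dict(rows), {a: TRANSPOSE[t] for a, t in arcs.items()}


def dw_operator(network, p1, p2):
    """Row split of the one-node network into top node 1 and bottom node 2."""
    nodes, rows, cols, arcs = network
    p1, p2 = frozenset(p1), frozenset(p2)
    if nodes != frozenset({0}) or arcs or p1 & p2 or (p1 | p2) != rows[0]:
        raise ValueError("expected a one-node network and a partition of its rows")
    return (frozenset({1, 2}), {1: p1, 2: p2},
            {1: cols[0], 2: cols[0]}, {(1, 2): "down", (2, 1): "up"})


def benders_operator(network, p1, p2):
    """Column split obtained as the dual of a row split of the dual network."""
    return network_dual(dw_operator(network_dual(network), p1, p2))


g1 = (frozenset({0}), {0: frozenset({"pi"})}, {0: frozenset({"x1", "x2"})}, {})
gB = benders_operator(g1, {"x1"}, {"x2"})
```

Node 1 (the left subproblem) receives the left arc and node 2 (the right subproblem) receives the right arc, in agreement with the network given in the text.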

Property 1.25 (Benders Inverse Operator) The inverse operator □ is applicable
to the Benders operator as well as the D-W operator:

\[ g_1 = (g_1 ◫ [P_1, P_2]) \,□\, \{1, 2\}, \]

for all partitions $[P_1, P_2]$ of $\mathcal{C}_0$.

Proof: Since we know that $g_1 = (g_1 ⊟ [P_1, P_2]) \,□\, \{1, 2\}$ already, by using network
duality, it must also hold that $g_1 = g_1^D = (g_1^D ⊟ [P_1, P_2]) \,□\, \{1, 2\}$. In addition,
since both $g_B^D = g_1^D ⊟ [x_1, x_2]$ and $g_B = g_1 ◫ [x_1, x_2]$, we now have $g_1 =
(g_1 ◫ [x_1, x_2])^D \,□\, \{1, 2\}$. But inversion is not concerned with indices or arc types, so
finally, $g_1 = (g_1 ◫ [x_1, x_2]) \,□\, \{1, 2\}$. ∎


1.5  Subproblem  Interfaces 

The positions of nonzeros in, and the partitioning of, the constraint matrix affect
the quantity of information communicated between subproblems. Vacuous columns
in a row partition let the corresponding variables be free of constraints. When this
occurs, there is no need to pass information indicating that a variable is free. It is
instead possible to make this fact implicit in the formulation of the other subproblem.

The subproblem interface theorem characterizes the portions of subproblem solutions
that are exchanged. It is based on the following LP formulation, which uses
a partitioning of the columns of (1.2) so that $A_1 = (A_{11}\ A_{12})$, $A_2 = (A_{21}\ A_{22})$,
and $c^T = (c_1^T\ c_2^T)$. Thus, the LP of interest is

\[
\begin{array}{ll}
\min\limits_{x \ge 0} & c_1^T x_1 + c_2^T x_2 = z \\
\text{s.t.} & \pi_1:\ A_{11} x_1 + A_{12} x_2 \ge b_1 \\
 & \pi_2:\ A_{21} x_1 + A_{22} x_2 \ge b_2,
\end{array}
\tag{1.9}
\]

with accompanying starting network $g_1$ defined as follows:

\[ g_1 = (\{0\}, \pi_1 \cup \pi_2, x_1 \cup x_2, \emptyset). \]

Theorem 1.26 (Subproblem Interfaces) Let us assume that (1.9) is decomposed
by the D-W method using the row partition $[\pi_1, \pi_2]$. If $A_{12} = 0$, the dual solution to
$\psi_2$ in the top subproblem is a constant equal to $-c_2$. The subproblems are formulated
as (1.10)$'$ and (1.11)$'$ below. Similarly, if $A_{22} = 0$, the primal feasible region for $x_{22}$
is the positive orthant in the bottom subproblem and can be expressed instead using
subproblems formulated as (1.10)$''$ and (1.11)$''$.


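Proof: With D-W decomposition of (1.9) using the partition $[\pi_1, \pi_2]$, we get the
subproblem formulations:

\[
\begin{array}{ll}
\min\limits_{l \ge 0,\ x_{11},\ x_{12}} & c_1^T x_{11} + c_2^T x_{12} = z_1 \\
\text{s.t.} & \theta:\ g^T l = 1 \\
 & \psi_1:\ X_{21} l - I x_{11} = 0 \\
 & \psi_2:\ X_{22} l - I x_{12} = 0 \\
 & \pi_1:\ A_{11} x_{11} + A_{12} x_{12} \ge b_1,
\end{array}
\tag{1.10}
\]

and

\[
\begin{array}{ll}
\min\limits_{x_{21} \ge 0,\ x_{22} \ge 0,\ w} & \delta w = z_2 \\
\text{s.t.} & \nu:\ \psi_1 x_{21} + \psi_2 x_{22} + \delta w \ge -\theta \\
 & \pi_2:\ A_{21} x_{21} + A_{22} x_{22} \ge b_2.
\end{array}
\tag{1.11}
\]

Case 1: If $A_{12} = 0$ the solution for $\psi_2$ always equals $-c_2$ by the dual constraints
(the dual constraint of the free column $x_{12}$ in (1.10) reads $A_{12}^T \pi_1 - \psi_2 = c_2$).
Substituting $X_{22} l$ for $x_{12}$ in the objective row of (1.10) and fixing $\psi_2 = -c_2$ in (1.11),
we get

\[
\begin{array}{ll}
\min\limits_{l \ge 0,\ x_{11}} & c_2^T X_{22} l + c_1^T x_{11} = z_1 \\
\text{s.t.} & \theta:\ g^T l = 1 \\
 & \psi_1:\ X_{21} l - I x_{11} = 0 \\
 & \pi_1:\ A_{11} x_{11} \ge b_1,
\end{array}
\tag{1.10$'$}
\]

and

\[
\begin{array}{ll}
\min\limits_{x_{21} \ge 0,\ x_{22} \ge 0,\ w} & c_2^T x_{22} + \delta w = z_2 \\
\text{s.t.} & \nu:\ \psi_1 x_{21} + \delta w \ge -\theta \\
 & \pi_2:\ A_{21} x_{21} + A_{22} x_{22} \ge b_2.
\end{array}
\tag{1.11$'$}
\]
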
Case 2: If $A_{22} = 0$ the feasible region for $x_{22}$ is the positive orthant. Therefore,
by moving the non-negativity constraints for these columns to the top subproblem
and eliminating those columns from the bottom, we get

\[
\begin{array}{ll}
\min\limits_{l \ge 0,\ x_{11},\ x_{12} \ge 0} & c_1^T x_{11} + c_2^T x_{12} = z_1 \\
\text{s.t.} & \theta:\ g^T l = 1 \\
 & \psi_1:\ X_{21} l - I x_{11} = 0 \\
 & \pi_1:\ A_{11} x_{11} + A_{12} x_{12} \ge b_1,
\end{array}
\tag{1.10$''$}
\]

and

\[
\begin{array}{ll}
\min\limits_{x_{21} \ge 0,\ w} & \delta w = z_2 \\
\text{s.t.} & \nu:\ \psi_1 x_{21} + \delta w \ge -\theta \\
 & \pi_2:\ A_{21} x_{21} \ge b_2.
\end{array}
\tag{1.11$''$}
\]

In summary,

If $A_{12} = 0$ then (1.10)(1.11) = (1.10)$'$(1.11)$'$, and
if $A_{22} = 0$ then (1.10)(1.11) = (1.10)$''$(1.11)$''$.

We have also proved that the communication graph is not affected by alternative
subproblem formulations, hence alternative subproblem interfaces: $g_1 ⊟ [\pi_1, \pi_2] = g_2$
where

\[ g_2 = (\{1, 2\}, \pi_1, \pi_2, x_1 \cup x_2, x_1 \cup x_2, \{(12), (21)\}, \text{down}, \text{up}). \]

The dual of the Subproblem Interface Theorem yields the following corollary. The
subproblem formulations are left as an exercise.

Corollary 1.27 (Subproblem Interfaces) Let us assume that (1.9) is decomposed
by the Benders method using the column partition $[x_1, x_2]$. If $A_{21} = 0$, the primal
solution to $y_2$ in the left subproblem is a constant equal to $-b_2$. Similarly, if $A_{22} = 0$,
the dual feasible region for $\pi_{22}$ in the right subproblem is the positive orthant.

In the sequel, unnecessary variables and constraints will be dropped when submatrices
are equal to zero.
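
As a small illustration of the D-W interface cases, the sketch below inspects the two column blocks belonging to $x_2$ and reports which formulation applies and which $x_2$-related data still has to cross the interface. The function names, the dictionary keys, the dense list-of-lists storage, and the choice to test the $A_{22}$ block first when both blocks vanish are all assumptions of this sketch.

```python
def is_zero_block(block):
    """True when every entry of a dense block (list of rows) is zero."""
    return all(value == 0 for row in block for value in row)


def dw_interface(A12, A22):
    """Pick the D-W subproblem formulation implied by the zero blocks."""
    if is_zero_block(A22):
        # x22 is dropped from the bottom: nothing about x2 crosses the interface.
        return {"formulation": "(1.10)'', (1.11)''",
                "pass_psi2_down": False, "pass_x22_up": False}
    if is_zero_block(A12):
        # psi2 is the constant -c2, but X22 still enters the top objective.
        return {"formulation": "(1.10)', (1.11)'",
                "pass_psi2_down": False, "pass_x22_up": True}
    return {"formulation": "(1.10), (1.11)",
            "pass_psi2_down": True, "pass_x22_up": True}


# Made-up blocks: A12 vanishes, A22 does not.
report = dw_interface([[0, 0]], [[1, 2]])
```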

1.6  Summary 

We have reviewed the theory of Dantzig-Wolfe and Benders decomposition, and found
their subproblem formulations and their algorithms to be duals of each other. In the
process, we introduced the symbolic space of communication networks with one and
two nodes, $\mathcal{G}^1$ and $\mathcal{G}^2$ respectively. The algebraic decomposition of LP subproblems
is equated to the splitting of node index sets in communication networks. The set $\mathcal{G}^1$
contains one network which is self dual, and the set $\mathcal{G}^2$ contains two networks which
are duals of each other. In the next chapter we will explore the span of decomposition
schemes possible under our defined operators. It is $\mathcal{G}^N$. Since higher dimensional
schemes are constructed upon lower dimensional ones, and since the two entries of $\mathcal{G}^2$
are duals, this automatically divides all of $\mathcal{G}^N$ in half. Every scheme in one half has
a dual scheme in the other; except for $\mathcal{G}^1$, which lies in both (or neither).

We have shown that the duality of linear programming translates directly to a
duality for networks. The next chapter characterizes the space of all subproblem
formulations and likewise their accompanying communication networks. Once characterized,
we can give the transforms and the duality of all networks.

Finally, the pattern of zeros and nonzeros in the constraint matrix affects the
interfaces between subproblems, and even their formulations. By investigating these
patterns, we can drastically reduce the quantity of information communicated over
the network for sparse problems.


Chapter  2 


Characterizing  Communication 
Networks 


EFFICIENT use of a parallel computer requires that there be many subproblems
that can be solved independently. Having completed the derivations
of the top, bottom, left and right subproblems, and the communication diagrams
describing their interactions, we now embark on an exploration of the versatility
of the D-W and Benders partitioning operators and their use in creating many
subproblems for a parallel computer to solve. The following sections describe a variety
of decomposition schemes that can be generated with partially ordered sets of
partitions within partitions. Each scheme alone is capable of performing the oracle on the
original problem, $\mathcal{O}(1.1)$.

This chapter concentrates solely on partitions, subproblems, and networks: a
skeletal framework on which to attach the muscles, the algorithms. The next chapter
covers the parallel decomposition oracle, which is a relaxed form of nesting oracles.
Naturally, we get a serial oracle from the parallel one when using only one processor.
For now we take faith in nesting the oracle and proceed.

Figure 2.1: Symbolic Decomposition covered by Chapter Two.


2.1  Nested  Decomposition 

In the previous chapter we described decomposition in terms of operations on
communication networks that form new, higher-ordered networks. The term nested, in
the title of this section, refers to the practice of embedding one thing within another.
Using our operators, we can nest partitions and oracles with a sequence of slices. For
the present, we will nest only D-W decomposition and state simply that the dual of
each operation we perform applies equally well in the context of Benders
decomposition. As an extension to traditional decomposition, we introduce cross-nesting, the
practice of using both D-W and Benders decomposition on the same problem.

We consider three variations of nesting the D-W and Benders operators in the
network $g_{⊟}$, which together with their dual versions comprise the complete set of
communication networks on three nodes: $\mathcal{G}^3$. The three variations are: splitting the
top with ⊟, and splitting the bottom with ⊟ and with ◫. We demonstrate the algebraic
derivations of the subproblems and note that we can successfully use the defined
operators to chronicle the mapping of $g_{⊟}$ into $\mathcal{G}^3$. As a summary, we present the
general forward transformation, and apply our decomposition operators to networks
in $\mathcal{G}^N$.


Early references on nested decomposition are [Dan73], [Gla73], and [Ho74]. Later,
Abrahamson [Abr83] and Wittrock [Wit83] enhanced the dual version, or Nested
Benders Decomposition. Here, we take a very generic view of nesting, considering
more fashions of subproblems.

Before starting, let us redefine (1.1) with

$$
A = \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix},
$$

so that we now wish to find an oracle for solving

$$
\begin{array}{rl}
\min\limits_{x \ge 0} & c^T x = z \\
\text{s.t.}\quad \pi_1: & A_1 x \ge b_1 \\
\pi_2: & A_2 x \ge b_2 \\
\pi_3: & A_3 x \ge b_3,
\end{array}
\tag{2.1}
$$

with its corresponding network having one node and no arcs,

$$
g_1 = (\{0\},\ \pi_1 \cup \pi_2 \cup \pi_3,\ x,\ \emptyset).
$$

The last chapter covered the two cases of applying ⊟ and ◫ to $g_1$ to generate
$g_{⊟}$ and $g_{◫}$. In turn we will now apply the same operators to $g_{⊟}$ and $g_{◫}$. The main
difference between splitting the node in $g_1$ and one in either $g_{⊟}$ or $g_{◫}$ is that the two
latter types have incident arcs. What do we do with these incident arcs? In the next
two lemmas, we adopt the convention that:

When a node in $g_{⊟}$ is split using ⊟, those arcs once incident to the split
node will be made incident to the new top node.

This  convention  makes  communication  networks  have  tree  structures.  Splitting  a 
bottom  node  extends  a  branch  of  the  network,  while  splitting  the  top  node  starts  a 
new  branch. 


Lemma 2.1 (⊟ on the Bottom Node) The D-W operator can be applied to the
bottom node using the expression

$$
g_3 = g_1 ⊟ [\pi_1, [\pi_2, \pi_3]],
$$

where

$$
g_3 = (\{1,3,4\},\ \pi_1, \pi_2, \pi_3,\ x, x, x,\ \{(13),(34),(31),(43)\},\ \text{down, down, up, up}).
$$

Proof: Let $[\pi_1, [\pi_2, \pi_3]]$ be a partition of the original row index set $\mathcal{R}$. Decompose
the LP formulated in (2.1) into three subproblems using two applications of the D-W
operator. The resulting subproblems will exhibit a linear communication structure
as in Figure 2.2. The dotted supernode 2 represents an implicit subproblem that has
itself been decomposed into nodes 3 and 4.

Figure 2.2: Splitting the bottom node.

The first application of the operator groups the bottom two indices together in
the second partition:

$$
g_1 ⊟ [\pi_1, \pi_2 \cup \pi_3] = g_2,
$$

where the resulting subproblems are

$$
\begin{array}{rl}
\min\limits_{x_1,\ l_2 \ge 0} & c^T x_1 = z_1 \\
\text{s.t.}\quad \theta_1: & g_2^T l_2 = 1 \\
\phi_1: & X_2 l_2 - I x_1 = 0 \\
\pi_1: & A_1 x_1 \ge b_1,
\end{array}
\tag{2.2}
$$

and

$$
\begin{array}{rl}
\min\limits_{x_2 \ge 0,\ w_2} & \delta_1 w_2 = z_2 \\
\text{s.t.}\quad v_2: & \phi_1^T x_2 + \delta_1 w_2 \ge -\theta_1 \\
\pi_2: & A_2 x_2 \ge b_2 \\
\pi_3: & A_3 x_2 \ge b_3,
\end{array}
\tag{2.3}
$$

and $g_2 = (\{1,2\},\ \pi_1, \pi_2 \cup \pi_3,\ x, x,\ \{(12),(21)\},\ \text{down, up})$. The second application of
the D-W operator uses the partition $[\pi_2, \pi_3]$ to slice horizontally through (2.3). The
resulting subproblem formulations are

$$
\begin{array}{rl}
\min\limits_{x_2,\ l_3 \ge 0,\ w_2} & \delta_1 w_2 = z_2 \\
\text{s.t.}\quad v_2: & \phi_1^T x_2 + \delta_1 w_2 \ge -\theta_1 \\
\theta_2: & g_3^T l_3 = 1 \\
\phi_2: & X_3 l_3 - I x_2 = 0 \\
\pi_2: & A_2 x_2 \ge b_2,
\end{array}
\tag{2.4}
$$

and

$$
\begin{array}{rl}
\min\limits_{x_3 \ge 0,\ w_3} & \delta_2 w_3 = z_3 \\
\text{s.t.}\quad v_3: & \phi_2^T x_3 + \delta_2 w_3 \ge -\theta_2 \\
\pi_3: & A_3 x_3 \ge b_3,
\end{array}
\tag{2.5}
$$

with the $v_2$ constraints included in the top partition.

The final network that corresponds to the subproblem triple (2.2)(2.4)(2.5) is $g_3$
and can be compared to that pictured in Figure 2.2. ∎
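The arc convention used in this proof can be replayed mechanically. The following is a minimal Python sketch under stated assumptions, not the code written for this thesis: the `Network` container, the function name `dw_split`, and the string labels standing in for the row blocks are all hypothetical. It splits a node by a row partition, makes the arcs once incident to the split node incident to the new top node, and adds the new down and up arcs.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical container for a communication network."""
    rows: dict                                  # node -> frozenset of row blocks
    cols: dict                                  # node -> frozenset of column blocks
    arcs: dict = field(default_factory=dict)    # (source, dest) -> arc type


def dw_split(g, node, top, bottom, top_rows, bottom_rows):
    """Split `node` by a row partition; arcs once incident to `node`
    are made incident to the new top node (the convention in the text)."""
    top_rows, bottom_rows = frozenset(top_rows), frozenset(bottom_rows)
    assert top_rows | bottom_rows == g.rows[node]
    assert not top_rows & bottom_rows
    rows = {n: r for n, r in g.rows.items() if n != node}
    cols = {n: c for n, c in g.cols.items() if n != node}
    rows[top], rows[bottom] = top_rows, bottom_rows
    cols[top] = cols[bottom] = g.cols[node]     # both halves keep all columns
    arcs = {}
    for (src, dst), kind in g.arcs.items():
        arcs[(top if src == node else src, top if dst == node else dst)] = kind
    arcs[(top, bottom)] = "down"
    arcs[(bottom, top)] = "up"
    return Network(rows, cols, arcs)


# One node, no arcs, then two row splits: node 0 -> {1, 2}, node 2 -> {3, 4}.
g1 = Network({0: frozenset({"pi1", "pi2", "pi3"})}, {0: frozenset({"x"})})
g2 = dw_split(g1, 0, 1, 2, {"pi1"}, {"pi2", "pi3"})
g3 = dw_split(g2, 2, 3, 4, {"pi2"}, {"pi3"})
print(sorted(g3.arcs.items()))
```

Splitting the top node of the two-node network instead (rows `{"pi1", "pi2"}` on top, then node 1 into nodes 3 and 4) yields a branching arc set rather than a chain.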

By reordering the nesting we just used, we induce a different communication
pattern from the one above. When we split the top node with ⊟, we spawn a new
branch in the network.


Lemma 2.2 (⊟ on the Top Node) The D-W operator can be applied to the top
node using the expression

$$
g_1 ⊟ [[\pi_1, \pi_2], \pi_3] = g_3,
$$

where

$$
g_3 = (\{3,4,2\},\ \pi_1, \pi_2, \pi_3,\ x, x, x,\ \{(34),(32),(43),(23)\},\ \text{down, down, up, up}).
$$




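Proof: Apply the partition $[[\pi_1, \pi_2], \pi_3]$ to (2.1) to get the system of subproblems
(2.6)(2.7)(2.5), where

$$
\begin{array}{rl}
\min\limits_{l_2 \ge 0,\ l_3 \ge 0,\ x_1} & c^T x_1 = z_1 \\
\text{s.t.}\quad \theta_1: & g_2^T l_2 = 1 \\
\theta_2: & g_3^T l_3 = 1 \\
\phi_1: & X_2 l_2 - I x_1 = 0 \\
\phi_2: & X_3 l_3 - I x_1 = 0 \\
\pi_1: & A_1 x_1 \ge b_1,
\end{array}
\tag{2.6}
$$

and

$$
\begin{array}{rl}
\min\limits_{x_2 \ge 0,\ w_2} & \delta_1 w_2 = z_2 \\
\text{s.t.}\quad v_2: & \phi_1^T x_2 + \delta_1 w_2 \ge -\theta_1 \\
\pi_2: & A_2 x_2 \ge b_2,
\end{array}
\tag{2.7}
$$

with the accompanying communication network $g_3$ as shown in Figure 2.3. ∎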


Figure  2.3:  Splitting  the  top  node. 


The  term  cross  splitting  will  be  used  to  describe  the  process  of  nesting  Benders 
decomposition  within  D-W  decomposition,  and  vice  versa.  We  use  the  following 
definition  to  identify  such  a  condition. 

Definition 2.3 (Cross Splitting) A node is cross split when the ⊟ operator is
applied and it has an incoming right arc, and similarly, when the ◫ operator is
applied and it has an incoming down arc.
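Definition 2.3 amounts to a test on the types of the arcs entering a node. A minimal Python sketch follows; the function name `is_cross_split`, the operator labels `"dw"` and `"benders"`, and the dictionary encoding of arcs are assumptions made for illustration only.

```python
def is_cross_split(arcs, node, operator):
    """Return True when applying `operator` to `node` is a cross split.

    arcs:     dict mapping (source, dest) -> "up" | "down" | "left" | "right"
    operator: "dw" for the row (D-W) operator, "benders" for the column one
    """
    incoming = {kind for (src, dst), kind in arcs.items() if dst == node}
    if operator == "dw":
        return "right" in incoming
    if operator == "benders":
        return "down" in incoming
    raise ValueError("operator must be 'dw' or 'benders'")


# The two-node D-W network: node 1 on top, node 2 on the bottom.
g2_arcs = {(1, 2): "down", (2, 1): "up"}
print(is_cross_split(g2_arcs, 2, "benders"))   # bottom node has an incoming down arc
print(is_cross_split(g2_arcs, 2, "dw"))
```

The predicate only classifies a split; whether a given cross split is admitted is a separate question treated in the text that follows.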

We will not cross split added constraints and variables that function as partial
representations of the feasible regions of still other subproblems (incoming up and
left arcs). This possibility would take us beyond the scope of this thesis. However,
the extra subproblem data that implement objective or right-hand side modifications
are considered valid places for cross splitting. For instance, it is valid to partition the
columns of the D-W bottom subproblem (1.5) but not the top one (1.4). Figure 2.4
illustrates the communication network that results from cross splitting the bottom
node in $g_{⊟}$.


Figure  2.4:  Cross  splitting  the  bottom  node. 


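First, we repeat the LP formulation used for the Subproblem Interface Theorem:

$$
\begin{array}{rl}
\min\limits_{x_1 \ge 0,\ x_2 \ge 0} & c_1^T x_1 + c_2^T x_2 = z \\
\text{s.t.}\quad \pi_1: & A_{11} x_1 + A_{12} x_2 \ge b_1 \\
\pi_2: & A_{21} x_1 + A_{22} x_2 \ge b_2,
\end{array}
\tag{2.8}
$$

and its accompanying network is

$$
g_1 = (\{0\},\ \pi_1 \cup \pi_2,\ x_1 \cup x_2,\ \emptyset).
$$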

Figure 2.4 completes the depiction of all networks in $\mathcal{G}^3$ that can be generated
from $g_{⊟}$. The changes made to $g_{⊟}$ to get this network do not follow our previous
convention of making arcs once incident to the split node incident now to the new
left node. Because we have cross split, the arcs in question must be duplicated and
made incident to both new nodes. This is evident from the subproblem formulations
and an application of the forward transform.

Lemma 2.4 (◫ on the Bottom Node) The Benders operator ◫ can be applied
to the bottom node using the expression

$$
(g_1 ⊟ [\pi_1, \pi_2])\ ◫\ [x_1, x_2] = g_3,
$$

where

$$
g_3 = (\{1,3,4\},\ \pi_1, \pi_2, \pi_2,\ x_1 \cup x_2, x_1, x_2,\ \{(13),(14),(31),(41),(34),(43)\},\ \text{down, down, up, up, right, left}).
$$


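Proof: First we create modified versions of the top and bottom subproblems on
which to demonstrate. To do this, apply D-W decomposition to (2.8) as the following
expression suggests:

$$
g_2 = g_1 ⊟ [\pi_1, \pi_2],
$$

where $g_2 = (\{1,2\},\ \pi_1, \pi_2,\ x_1 \cup x_2, x_1 \cup x_2,\ \{(12),(21)\},\ \text{down, up})$. The top and bottom
subproblem formulations are:

$$
\begin{array}{rl}
\min\limits_{x_{11},\ x_{12},\ l \ge 0} & c_1^T x_{11} + c_2^T x_{12} = z_1 \\
\text{s.t.}\quad \theta: & g^T l = 1 \\
\phi_1: & X_{21} l - I x_{11} = 0 \\
\phi_2: & X_{22} l - I x_{12} = 0 \\
\pi_1: & A_{11} x_{11} + A_{12} x_{12} \ge b_1,
\end{array}
\tag{2.9}
$$

and

$$
\begin{array}{rl}
\min\limits_{w_1,\ w_2,\ x_{21} \ge 0,\ x_{22} \ge 0} & \delta w_1 + \delta w_2 = z_2 \\
\text{s.t.}\quad v_1: & \phi_1^T x_{21} + \delta w_1 \ge -\theta \\
v_2: & \phi_2^T x_{22} + \delta w_2 \ge 0 \\
\pi_2: & A_{21} x_{21} + A_{22} x_{22} \ge b_2.
\end{array}
\tag{2.10}
$$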


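Note that some liberty was taken in the formulation of (2.10) by implementing the
objective modification with two added variables and constraints instead of one of
each.

Continue by crossing Benders decomposition on (2.10) with the partition $[x_{21}, x_{22}]$.
The left and right subproblems are

$$
\begin{array}{rl}
\min\limits_{x_{21} \ge 0,\ w_1,\ y,\ t} & \delta w_1 + t = z_{21} \\
\text{s.t.}\quad v_1: & \phi_1^T x_{21} + \delta w_1 \ge -\theta \\
\pi_{21}: & A_{21} x_{21} - I y = b_2 \\
\lambda_1: & \Pi_{22}\, y + \gamma_{22}\, t \ge 0,
\end{array}
\tag{2.11}
$$

$$
\begin{array}{rl}
\min\limits_{x_{22} \ge 0,\ w_2,\ u_{22}} & -t\, u_{22} + \delta w_2 = z_{22} \\
\text{s.t.}\quad v_{22}: & \phi_2^T x_{22} + \delta w_2 \ge 0 \\
\pi_{22}: & y\, u_{22} + A_{22} x_{22} \ge 0 \\
w_{22}: & \delta u_{22} = \delta,
\end{array}
\tag{2.12}
$$

where $w_1$ has followed $x_{21}$, and $w_2$ has followed $x_{22}$.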


To summarize this cross splitting example, we began with the linear program (2.8),
applied Dantzig-Wolfe decomposition to obtain the subproblem system (2.9)(2.10),
then applied Benders to the bottom subproblem. This was made possible by
expressing the added structure for objective modification with multiple constraints, instead
of a single one. The final communication network is $g_3$ and is shown in Figure 2.4.
The final subproblem system is (2.9)(2.11)(2.12).

The following corollary formally notes that the other networks in $\mathcal{G}^3$ are duals of
the three above.


Corollary 2.5 (Nested Duality) Lemmas 2.1, 2.2 and 2.4 apply also to splitting
the nodes of $g_{◫}$ with the words top and bottom replaced by left and right, and switching
⊟ with ◫ and row partitions with column partitions.

Proof: This statement follows from the network duality theorem. ∎

In conclusion, the operations on $g_{⊟}$ and $g_{◫}$ have demonstrated how to split top,
bottom, left and right nodes. The distinguishing feature of these nodes was that
they had incident arcs from either above or below, but not both. We look forward to
having our operators applied to any node in a network.

2.2 Characterizing $\mathcal{G}^3$

As we have stated earlier, the previous three operations on the network $g_{⊟}$ are defined
to be the only valid ones. Table 2.1 enumerates the six mappings from $\mathcal{G}^2$ to $\mathcal{G}^3$, and
Figure 2.5 has them drawn out.

The following are the descriptions for column headings in Table 2.1. The table
represents the results of a boolean function on the column headings. Each heading
can take one of two values.

First Split Uses: Either the ⊟ or the ◫ operator is applied to $g_1$ to get either $g_{⊟}$
or $g_{◫}$.

Second Split Uses: The operator used for the second slice.

Second Splits Node: The node split on the second slice. We use number 1 to
indicate the top or left node, and the number 2 to indicate the bottom or right
node.
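The boolean function behind Table 2.1 can be written out directly. The sketch below is illustrative only: the rule in `is_valid` is inferred from the text (a top or left node is never cross split), and the labels `"dw"` and `"benders"` are stand-ins for the row and column operators.

```python
from itertools import product

OPERATORS = ("dw", "benders")    # row (D-W) and column (Benders) operators


def is_valid(first, second, node):
    """A second split is rejected only when it cross splits node 1,
    i.e. the top (after D-W) or left (after Benders) node."""
    return not (first != second and node == 1)


table = [(first, second, node, is_valid(first, second, node))
         for first, second, node in product(OPERATORS, OPERATORS, (1, 2))]
valid = [row for row in table if row[3]]

for first, second, node, ok in table:
    print(f"{first:8s} {second:8s} {node}  {'Valid' if ok else 'Invalid'}")
```

Of the eight combinations, the rule keeps the six that Table 2.1 numbers and rejects the two cross splits of node 1.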


        First    Second   Second
        Split    Split    Splits
        Uses     Uses     Node

   1    ⊟        ⊟        1        Valid
   2    ⊟        ⊟        2        Valid
        ⊟        ◫        1        Invalid
   3    ⊟        ◫        2        Valid
        ◫        ⊟        1        Invalid
   4    ◫        ⊟        2        Valid
   5    ◫        ◫        1        Valid
   6    ◫        ◫        2        Valid

Table 2.1: The Elements of $\mathcal{G}^3$.

2.3 Characterizing $\mathcal{G}^N$


By characterizing all decomposition schemes that are attainable with our defined set
of operators, we will be able to state a parallel decomposition oracle that holds over
the entire space. First we need to handle one more case for splitting nodes, then we
can show that our operators are well defined for any node by demonstrating their
validity on the most general case. We need to define a "middle" node and how to
split it.


Definition 2.6 (Middle Node) A node $n \in \mathcal{N}$ is a middle node if it has incoming
and outgoing up or left arcs.


Our convention on incident arcs remaining incident to the new top node means that
splitting a middle node adds a new branch to the network.


Lemma 2.7 (⊟ on a Middle Node) The D-W operator can be applied to a middle
node with no incoming left arcs.


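Proof: We start by redefining the LP (2.4) by the partition

$$
A_2 = \begin{pmatrix} A_{21} \\ A_{22} \end{pmatrix}
$$

(with $b_2$ partitioned conformably into $b_{21}$ and $b_{22}$) to arrive at

$$
\begin{array}{rl}
\min\limits_{x_2,\ l_3 \ge 0,\ w_2} & \delta_1 w_2 = z_2 \\
\text{s.t.}\quad v_2: & \phi_1^T x_2 + \delta_1 w_2 \ge -\theta_1 \\
\theta_2: & g_3^T l_3 = 1 \\
\phi_2: & X_3 l_3 - I x_2 = 0 \\
\pi_{21}: & A_{21} x_2 \ge b_{21} \\
\pi_{22}: & A_{22} x_2 \ge b_{22}.
\end{array}
\tag{2.13}
$$

Next, the expression

$$
g_1 ⊟ [\pi_1, [\pi_{21}, \pi_{22}], \pi_3]
$$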


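suggests that $\mathcal{O}(2.13)$ can in turn be performed by the D-W Method using $\mathcal{O}(2.14)$
and $\mathcal{O}(2.15)$, where

$$
\begin{array}{rl}
\min\limits_{x_2,\ l_3 \ge 0,\ l_{22} \ge 0,\ w_2} & \delta_1 w_2 = z_2 \\
\text{s.t.}\quad v_2: & \phi_1^T x_2 + \delta_1 w_2 \ge -\theta_1 \\
\theta_2: & g_3^T l_3 = 1 \\
\theta_{22}: & g_{22}^T l_{22} = 1 \\
\phi_2: & X_3 l_3 - I x_2 = 0 \\
\phi_{22}: & X_{22} l_{22} - I x_2 = 0 \\
\pi_{21}: & A_{21} x_2 \ge b_{21},
\end{array}
\tag{2.14}
$$

$$
\begin{array}{rl}
\min\limits_{x_{22} \ge 0,\ w_{22}} & \delta_{22} w_{22} = z_{22} \\
\text{s.t.}\quad v_{22}: & \phi_{22}^T x_{22} + \delta_{22} w_{22} \ge -\theta_{22} \\
\pi_{22}: & A_{22} x_{22} \ge b_{22}.
\end{array}
\tag{2.15}
$$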


The forward transform gives us the communication network in Figure 2.6.

Figure 2.6: Splitting a middle node.

In conclusion, our networks get longer from the bottom and bushier from the top and
the middle.

By proving a partial ordering property, we can use terms like upper, above, lower,
and below as relations between nodes. The > symbol is used to indicate an ordering
between two nodes.

Property 2.8 (Partial Node Ordering) There is a partial ordering on the nodes
of networks in $\mathcal{G}^N$, following the rule that $n_i > n_j$ if there is an up arc from $n_j$ to
$n_i$. Node $n_i$ is said to be above $n_j$.

Proof: There is a directed graph of up and left arcs that spans the nodes of $g \in \mathcal{G}^N$
and it has no directed circuits. Hence, it induces a partial order on up arcs. ∎
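The ordering in Property 2.8 can be tested on a concrete network by following up arcs. The sketch below is a minimal illustration that considers up arcs only; the function names and the dictionary encoding of arcs are assumptions and do not come from the thesis.

```python
def nodes_above(arcs, node):
    """All nodes reachable from `node` by following up arcs."""
    successors = {}
    for (src, dst), kind in arcs.items():
        if kind == "up":
            successors.setdefault(src, set()).add(dst)
    seen, stack = set(), [node]
    while stack:
        for nxt in successors.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def has_partial_order(arcs):
    """True when no node lies above itself (no directed circuit on up arcs)."""
    nodes = {n for arc in arcs for n in arc}
    return all(n not in nodes_above(arcs, n) for n in nodes)


# The three-node chain obtained by splitting the bottom node (nodes 1, 3, 4).
chain = {(1, 3): "down", (3, 4): "down", (3, 1): "up", (4, 3): "up"}
print(nodes_above(chain, 4))
print(has_partial_order(chain))
```

Node 1 is above node 4 here through node 3, which is the transitive reading of the rule.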

We now summarize the forward transform from Chapter 1 as a seven-step process.
The first step defines a node for each subproblem, and the other steps define the other
four tuples in a communication network according to the subproblem indices: $\mathcal{R}_n$ and
$\mathcal{C}_n$.

Theorem 2.9 (Generalized Transform) The transform from a system of N
subproblems to a communication network with N nodes is a seven step process:

1. For each subproblem, define a node in the set $\mathcal{N}$.

2. For each node $n \in \mathcal{N}$, define the elements of the row index set $\mathcal{R}_n = \mathcal{R}_n \cap \mathcal{R}$.

3. For each node $n \in \mathcal{N}$, define the elements of the column index set $\mathcal{C}_n = \mathcal{C}_n \cap \mathcal{C}$.

4. For each subproblem $n_1 \in \mathcal{N}$ containing constraints indexed by $\theta$ and $\phi$, and for
every other subproblem $n_2 \in \mathcal{N}$ that provides information for those constraints,
define an arc $(n_2 n_1) \in \mathcal{A}$ of type $T_{n_2 n_1}$ = up.

5. For each subproblem $n_1 \in \mathcal{N}$ containing constraints indexed by $v$, and for every
other subproblem $n_2 \in \mathcal{N}$ that provides information for those constraints, define
an arc $(n_2 n_1) \in \mathcal{A}$ of type $T_{n_2 n_1}$ = down.

6. For each subproblem $n_1 \in \mathcal{N}$ containing variables named $y$ and $t$, and for every
other subproblem $n_2 \in \mathcal{N}$ that provides information for those columns, define
an arc $(n_2 n_1) \in \mathcal{A}$ of type $T_{n_2 n_1}$ = left.

7. For each subproblem $n_1 \in \mathcal{N}$ containing variables named $u$, and for every other
subproblem $n_2 \in \mathcal{N}$ that provides information for those columns, define an arc
$(n_2 n_1) \in \mathcal{A}$ of type $T_{n_2 n_1}$ = right.

Proof: First, the theorem holds for schemes involving only D-W decomposition,
since we know that the transform is correct for a top, bottom, or middle node, as
already demonstrated. Second, by network duality, the theorem holds for schemes
involving only Benders decomposition. Finally, when both types of decomposition are
represented in the same network, we can transform a node with adjacent horizontal
and vertical arcs because the arcs have independent effects on the formulation. The
added variables and constraints of a subproblem interact only through their incidence
to the original primal and dual variables, $x$ and $\pi$. ∎
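The seven steps of the transform can be sketched as a single pass over the subproblems. This is a hedged illustration: the dictionary layout of a subproblem, and the keys `"theta_phi"`, `"v"`, `"y_t"` and `"u"` that list which other subproblems supply information for the added rows or columns, are hypothetical and chosen only to mirror steps 4 to 7.

```python
# Which added structure produces which arc type (steps 4 to 7).
ARC_TYPE = {"theta_phi": "up", "v": "down", "y_t": "left", "u": "right"}


def forward_transform(subproblems):
    """Map {node: description} to (nodes, rows, cols, arcs)."""
    nodes = set(subproblems)                                          # step 1
    rows = {n: frozenset(sp["rows"]) for n, sp in subproblems.items()}  # step 2
    cols = {n: frozenset(sp["cols"]) for n, sp in subproblems.items()}  # step 3
    arcs = {}
    for n1, sp in subproblems.items():                                # steps 4-7
        for added, kind in ARC_TYPE.items():
            for n2 in sp.get(added, ()):
                arcs[(n2, n1)] = kind      # the provider n2 sends to n1
    return nodes, rows, cols, arcs


# A top, middle and bottom subproblem, labelled 1, 3 and 4.
system = {
    1: {"rows": {"pi1"}, "cols": {"x"}, "theta_phi": [3]},
    3: {"rows": {"pi2"}, "cols": {"x"}, "v": [1], "theta_phi": [4]},
    4: {"rows": {"pi3"}, "cols": {"x"}, "v": [3]},
}
nodes, rows, cols, arcs = forward_transform(system)
print(sorted(arcs.items()))
```

On this chain the sketch produces two down arcs and two up arcs, one pair per link.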

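Definition 2.10 (Inverse Operator on $\mathcal{G}^N$) The inverse operation on $\mathcal{G}^N$ is
defined as one being reversible by a series of applications of the D-W and/or Benders
operators.

$$
\mathcal{G}^N\ □\ \mathcal{N}^* \to \mathcal{G}^{N - |\mathcal{N}^*| + 1},
$$

where $|\mathcal{N}^*| \le N$. For the specific networks $g_1 \in \mathcal{G}^N$ and $g_2 \in \mathcal{G}^{N - |\mathcal{N}^*| + 1}$, to take an
inverse using any $\mathcal{N}^* \subset \mathcal{N}_1$, collapse all the nodes into one, and redefine all of the
incident arcs as follows:

1. $\mathcal{N}_2 \leftarrow \mathcal{N}_1 - \mathcal{N}^* + \{n\}$, $n \notin \mathcal{N}_1$,

2. $\mathcal{R}_n = \bigcup_{n_i \in \mathcal{N}^*} \mathcal{R}_{n_i}$,

3. $\mathcal{C}_n = \bigcup_{n_i \in \mathcal{N}^*} \mathcal{C}_{n_i}$,

4. $\mathcal{A}_2 \leftarrow \mathcal{A}_1 - \bigcup_{n_1 \in \mathcal{N}^*} \{(n_1 n_2)\} - \bigcup_{n_2 \in \mathcal{N}^*} \{(n_1 n_2)\} + \bigcup_{(n_1 n_2) \in \mathcal{A}_1,\ n_1 \in \mathcal{N}^*} \{(n\, n_2)\} + \bigcup_{(n_1 n_2) \in \mathcal{A}_1,\ n_2 \in \mathcal{N}^*} \{(n_1 n)\}$,

5. $T_{n_1 n} = T_{n_1 n_2}$ for all $(n_1 n) \in \mathcal{A}_2$ and $(n_1 n_2) \in \mathcal{A}_1$,

where we have used + and - to mean set union and subtraction. Note that we will
not allow sets to contain duplicate elements.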
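The collapse described in Definition 2.10 can be sketched as follows. This is an illustrative reading, not the thesis implementation: the function name `collapse` is hypothetical, arcs between two collapsed nodes are assumed to disappear, and redirected arcs that coincide are assumed to carry the same type.

```python
def collapse(rows, cols, arcs, group, new):
    """Merge the nodes in `group` into the single node `new`."""
    group = set(group)
    new_rows = {n: r for n, r in rows.items() if n not in group}
    new_cols = {n: c for n, c in cols.items() if n not in group}
    new_rows[new] = frozenset().union(*(rows[n] for n in group))   # step 2
    new_cols[new] = frozenset().union(*(cols[n] for n in group))   # step 3
    new_arcs = {}
    for (src, dst), kind in arcs.items():                          # steps 4-5
        if src in group and dst in group:
            continue                        # assumed: internal arcs vanish
        key = (new if src in group else src, new if dst in group else dst)
        if new_arcs.setdefault(key, kind) != kind:
            raise ValueError(f"conflicting arc types on {key}")
    return new_rows, new_cols, new_arcs


# Collapse the two lower nodes of the three-node chain back into one node.
rows = {1: frozenset({"pi1"}), 3: frozenset({"pi2"}), 4: frozenset({"pi3"})}
cols = {n: frozenset({"x"}) for n in rows}
arcs = {(1, 3): "down", (3, 4): "down", (3, 1): "up", (4, 3): "up"}
rows2, cols2, arcs2 = collapse(rows, cols, arcs, {3, 4}, 2)
print(arcs2)
```

Storing arcs as dictionary keys is what keeps the arc set free of duplicates in this sketch.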

The inverse operation is defined in terms of being reversible. We now give two
conditions on the set of nodes to collapse, $\mathcal{N}^*$, that offer this feature. First, define the
following terms:

up connected nodes: For some graph $g \in \mathcal{G}^N$, the nodes in the subset $\mathcal{N}^* \subset \mathcal{N}$ are
said to be up connected if and only if for all $n_1, n_2$ in $\mathcal{N}^*$ there exists an $n_1, n_2$
undirected path on up arcs that visits only nodes in $\mathcal{N}^*$.

left connected nodes: The dual of a network on up connected nodes.
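Up connectedness is an ordinary connectivity test restricted to up arcs and to the chosen subset. A minimal Python sketch, with a hypothetical function name and the same dictionary encoding of arcs assumed elsewhere in these examples:

```python
def is_up_connected(arcs, group):
    """True when every pair in `group` is joined by an undirected path
    on up arcs that visits only nodes in `group`."""
    group = set(group)
    if not group:
        return True
    neighbours = {n: set() for n in group}
    for (src, dst), kind in arcs.items():
        if kind == "up" and src in group and dst in group:
            neighbours[src].add(dst)
            neighbours[dst].add(src)
    start = next(iter(group))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == group


# A branching network: nodes 4 and 2 both hang below node 3.
branch = {(3, 4): "down", (3, 2): "down", (4, 3): "up", (2, 3): "up"}
print(is_up_connected(branch, {3, 4}))
print(is_up_connected(branch, {4, 2}))
```

The second call is false because the only path between nodes 4 and 2 passes through node 3, which lies outside the subset.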

Lemma 2.11 (□ for Connected Nodes) If for some network $g \in \mathcal{G}^N$, the nodes
in $\mathcal{N}^* \subset \mathcal{N}$ are either up or left connected, then the effects of the inverse operation

$$
g' = g\ □\ \mathcal{N}^*
$$

can be reversed by a series of D-W or Benders operations, respectively.

Proof: By induction on the number of nodes in $\mathcal{N}^*$.

1. Show it for $|\mathcal{N}^*| = 2$. Take from a network $g \in \mathcal{G}^N$ two nodes $n_1$ and $n_2$ which
are up connected. We have shown in Chapter One the inverse operation used on
the networks in $\mathcal{G}^2$. This step is reversible because when splitting the aggregate
node, incident arcs can be replaced to their original positions by choosing the
proper partition.

2. Assume that the operator holds for $|\mathcal{N}^*| = N - 1$.

3. Show it for $|\mathcal{N}^*| = N$. Take a set of up connected nodes $\mathcal{N}^*$ where $|\mathcal{N}^*| = N$.
We have shown that any two up connected nodes in $\mathcal{N}^*$ can be joined into
one, thus reducing the order of the network by one node. By the induction
assumption, the operator then holds for any $\mathcal{N}^*$. ∎

Lemma 2.12 (□ for Unconnected Nodes) For some network $g \in \mathcal{G}^N$ and some
subset $\mathcal{N}^* = \{n_1, n_2\} \subset \mathcal{N}$, if $n_1$ and $n_2$ are both up connected to $n_3$, then the effects of
the inverse operation

$$
g' = g\ □\ \mathcal{N}^*
$$

are reversible by a series of splitting operations.

Proof: We use a bidirectional sequence of inverse and splitting operations on
networks in $\mathcal{G}^1$ to $\mathcal{G}^3$ to show that the necessary node configurations can be achieved.
For larger networks, any arcs not incident to these three nodes are left unaffected,
by design. Incident arcs are either between the three nodes, in which case they are
covered by the operators on $\mathcal{G}^3$, or they pass outside the three nodes, in which case
their sources and destinations within the three nodes can be set by choosing the
splitting partitions properly. Transforms from one network to another and back again are
shown in Figure 2.7 and explained below.

□: Collapse node 2 into node 1.

□: Collapse node 3 into node 12.

⊟: Split node 123 with a partition of the rows.

Figure 2.7: Proof of moving up arcs.


Figure 2.8: The generic node.

Definition 2.13 (Generic Node) As pictured in Figure 2.8, the generic node has
incident arcs such that:

1. incoming Up arcs have Sources in $\mathcal{N}_{US}$,

2. incoming Down arcs have Sources in $\mathcal{N}_{DS}$, and

3. incoming Right arcs have Sources in $\mathcal{N}_{RS}$,

4. outgoing Up arcs are Destined for nodes in $\mathcal{N}_{UD}$,

5. outgoing Down arcs are Destined for nodes in $\mathcal{N}_{DD}$,

6. outgoing Left arcs are Destined for nodes in $\mathcal{N}_{LD}$,

and no others.

Lemma 2.14 (Generic Node of $\mathcal{G}^N$) Each node and its incident arcs of a network
in $\mathcal{G}^N$ can be described within the structure of the generic node or its dual.

Proof: By induction on the number of nodes.

1. The lemma holds easily for networks in $\mathcal{G}^1$ and $\mathcal{G}^2$.

2. Assume that the lemma holds for all nodes of networks in $\mathcal{G}^N$.

3. By the definitions of the ⊟ and ◫ operators, no node may be cross split if it has
an incoming up or left arc. Only the ◫ operation may be used on the generic
node. Therefore, only left and right arcs may be added when splitting it, which
takes the network from $\mathcal{G}^N$ to $\mathcal{G}^{N+1}$. The dual holds for the dual of the generic
node. ∎

Lemma 2.15 ($\mathcal{G}^N\ □\ \mathcal{N}^* \to \mathcal{G}^4$) Pick a node $n$ in a network $g_N \in \mathcal{G}^N$. Using the □
operator, this network can be reduced to the four-node network in Figure 2.9, its dual,
or some special case of either.

Proof: It is sufficient to prove that the connected node sets of the generic node can
be reduced to the three nodes so that

a) $\mathcal{N}_{UD} = \mathcal{N}_{DS}$, $\mathcal{N}_{LD} = \mathcal{N}_{RS}$, $\mathcal{N}_{US} = \mathcal{N}_{DD}$,

b) $\mathcal{N}_{UD} \ne \mathcal{N}_{LD}$, $\mathcal{N}_{LD} \ne \mathcal{N}_{US}$, $\mathcal{N}_{US} \ne \mathcal{N}_{UD}$, and

c) $\mathcal{N}_{UD} = \{n_{UD}\}$, $\mathcal{N}_{LD} = \{n_{LD}\}$, $\mathcal{N}_{US} = \{n_{US}\}$.

We know a) holds since all communication networks are symmetric; for every arc
$(n_1 n_2)$ there is a corresponding arc $(n_2 n_1)$. We know b) holds because the nodes are
partially ordered. We know c) holds because:

• any nodes on left arcs will collapse to either one node in an up tree, or the two
nodes: $n_{LD}$ and $n$, and

• there is only one tree on up arcs containing $n$ implying that $\mathcal{N}_{UD}$ and $\mathcal{N}_{US}$ can
be collapsed to one node each. ∎

We now define the generalized Dantzig-Wolfe operator on the generic node.

Definition 2.16 (⊟ on the Generic Node) To apply the ⊟ operation to a generic
node $n \in \mathcal{N}_N$ from a network $g_N \in \mathcal{G}^N$ by $g_{N+1} = g_N ⊟ [P_1, P_2]$, we define the
transition of each tuple in $g_N$ to that in $g_{N+1}$.

1. Node $n$ is discarded and two new nodes are added: $\mathcal{N}_{N+1} = \mathcal{N}_N - \{n\} + \{n_1, n_2\}$,
where $n_1, n_2 \notin \mathcal{N}_N$.

2. All arcs incident to node $n$ are discarded. Of those, the vertical ones are linked
to node $n_1$ and the horizontal ones are duplicated and incident to both nodes
$n_1$ and $n_2$:

>Wi  =  -  U  {('i'»)>(nn')} 

+  U  {(«' »•)},+  U  {(»i n%  if  ytt  e  Pi 

+  U  {(n'n3)}» 

n'Gtfus 

+  U  >7«s  P, 

+  U  {(»'».), (n'n,)} 

+  U  «"■"')} 

n'€A/yo 

+  U  {(nin'),(n2n')} 

n'GtfLD 

+  {(»»!  «2),(w2«l)}- 


3. The column index sets for nodes $n_1$ and $n_2$ are the same as for node $n$: $\mathcal{C}_{n_1} =
\mathcal{C}_{n_2} = \mathcal{C}_n$, and the row sets for the new nodes are determined from the row
partition: $\mathcal{R}_{n_1} = P_1$ and $\mathcal{R}_{n_2} = P_2$.

4. The arc types of the repositioned and duplicated arcs stay the same, and the two
new arcs $(n_1 n_2)$ and $(n_2 n_1)$ become down and up ones respectively, with $i \in \{1, 2\}$:

$T_{n'' n_i}$ = up, $\forall n'' \in \mathcal{N}_{US}$, $(n'' n_i) \in \mathcal{A}_{N+1}$,
$T_{n'' n_i}$ = down, $\forall n'' \in \mathcal{N}_{DS}$, $(n'' n_i) \in \mathcal{A}_{N+1}$,
$T_{n'' n_i}$ = right, $\forall n'' \in \mathcal{N}_{RS}$, $(n'' n_i) \in \mathcal{A}_{N+1}$,
$T_{n_i n''}$ = up, $\forall n'' \in \mathcal{N}_{UD}$, $(n_i n'') \in \mathcal{A}_{N+1}$,
$T_{n_i n''}$ = down, $\forall n'' \in \mathcal{N}_{DD}$, $(n_i n'') \in \mathcal{A}_{N+1}$,
$T_{n_i n''}$ = left, $\forall n'' \in \mathcal{N}_{LD}$, $(n_i n'') \in \mathcal{A}_{N+1}$,
$T_{n_1 n_2}$ = down, and
$T_{n_2 n_1}$ = up.


The main theorem of this thesis involves a generalized ⊟ operation on nodes of
networks in $\mathcal{G}^N$. We first prove the operation on a close cousin of the generic node,
using node 3 in Figure 2.9.

Lemma 2.17 (⊟ on $\mathcal{G}^4$) The ⊟ operator as applied to the middle node in Figure 2.9
is a special case of the generalized ⊟ operator on the generic node.


Figure 2.9: The 4-node generic network.

Proof: Proof by comparison. Take the case where the subsets of connected nodes each
have one element, and $\mathcal{N}_{UD} = \mathcal{N}_{DS} = \{2\}$, $\mathcal{N}_{LD} = \mathcal{N}_{RS} = \{1\}$, $\mathcal{N}_{US} = \mathcal{N}_{DD} = \{4\}$.
We will call this network $g_4$. It has the following specification:

$g_4 = (\{1, 2, 3, 4\},$
$\pi, \pi_1, \pi_2 \cup \pi_3, \pi_4, x_1, x_2, x_2, x_2,$
$\{(12), (21), (13), (31), (14), (41), (23), (32), (34), (43)\},$
$T_{12} = T_{13} = T_{14}$ = right, $T_{21} = T_{31} = T_{41}$ = left,
$T_{23} = T_{34}$ = down, $T_{32} = T_{43}$ = up),

where $\pi = \pi_1 \cup \pi_2 \cup \pi_3 \cup \pi_4$. The operation on this network is $g_5 = g_4 ⊟ [\pi_2, \pi_3]$, where

$g_5 = (\{1, 2, 5, 6, 4\},$
$\pi, \pi_1, \pi_2, \pi_3, \pi_4, x_1, x_2, x_2, x_2, x_2,$
$\{(12), (21), (15), (51), (16), (61), (14), (41), (25), (52), (56), (65), (54), (45)\},$
$T_{12} = T_{15} = T_{16} = T_{14}$ = right, $T_{21} = T_{51} = T_{61} = T_{41}$ = left,
$T_{25} = T_{56} = T_{54}$ = down, $T_{52} = T_{65} = T_{45}$ = up).

We enumerate the steps used to convert $g_4$ into $g_5$:

1. One node is discarded and two are added: $\mathcal{N}_5 = \mathcal{N}_4 - \{3\} + \{5, 6\}$.

2. All arcs incident to node 3 are discarded. Of those, the vertical ones are
linked to node 5, and the horizontal ones are duplicated and incident to both
nodes 5 and 6: $\mathcal{A}_5 = \mathcal{A}_4 - \{(13), (31), (23),
(32), (34), (43)\} + \{(15), (51), (16), (61), (25), (52), (56), (65), (54), (45)\}$.

3. The column index sets for nodes 5 and 6 are the same as for node 3: $\mathcal{C}_5 = \mathcal{C}_6 = \mathcal{C}_3$,
and the row sets for the new nodes are determined from the row partition:
$\mathcal{R}_5 = \pi_2$ and $\mathcal{R}_6 = \pi_3$.

4. The arc types of the repositioned and duplicated arcs stay the same, and the
two new arcs (56) and (65) become down and up ones respectively.

Theorem 2.18 (⊟ and ◫ on $\mathcal{G}^N$) The ⊟ operator and its dual ◫, as defined on
the generic node, cover every possible transition of a network with N nodes to one
with N + 1 nodes.

Proof: By induction:

1. Lemma 2.17 (⊟ on $\mathcal{G}^4$).

2. Assume ⊟ up to $\mathcal{G}^N$.

3. Demonstrate that $\mathcal{G}^N ⊟ [P_1, P_2] \to \mathcal{G}^{N+1}$ as follows:

• Lemma 2.15 ($\mathcal{G}^N\ □\ \mathcal{N}^* \to \mathcal{G}^4$);

• Lemma 2.17 (⊟ on $\mathcal{G}^4$);

• the □ operator is defined to be reversible, which implies that every node
besides the two new ones $n_1$ and $n_2$, and every arc that was not incident
to node $n$ can be restored to its prior status as defined by $g_N$.

The dual argument holds by network duality. ∎
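The conversion of the four-node network into the five-node one, as enumerated in the proof of Lemma 2.17, can be replayed on the arc set alone. The sketch below is a simplified illustration under stated assumptions: the function name `split_generic` is hypothetical, index sets are not tracked, and every vertical arc of the split node is relinked to the new top node, which matches the arc lists given for this example.

```python
VERTICAL = {"up", "down"}


def split_generic(arcs, n, n1, n2):
    """Row split of node `n` into `n1` (top) and `n2` (bottom):
    vertical arcs are relinked to n1, horizontal arcs are duplicated."""
    new_arcs = {}
    for (src, dst), kind in arcs.items():
        if n not in (src, dst):
            new_arcs[(src, dst)] = kind          # untouched arc
            continue
        targets = (n1,) if kind in VERTICAL else (n1, n2)
        for t in targets:
            new_arcs[(t if src == n else src, t if dst == n else dst)] = kind
    new_arcs[(n1, n2)] = "down"
    new_arcs[(n2, n1)] = "up"
    return new_arcs


g4_arcs = {
    (1, 2): "right", (1, 3): "right", (1, 4): "right",
    (2, 1): "left", (3, 1): "left", (4, 1): "left",
    (2, 3): "down", (3, 4): "down", (3, 2): "up", (4, 3): "up",
}
g5_arcs = split_generic(g4_arcs, 3, 5, 6)
print(len(g5_arcs))
```

The result contains fourteen arcs: the four that never touched node 3, four duplicated horizontal arcs, four relinked vertical arcs, and the new pair between nodes 5 and 6.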


2.4  Summary 

We arrived at the beginning of this chapter carrying a transformation between linear
programs and communication networks, and some node splitting operators.

• We proceeded to nest the operators and got: node ordering, cross splitting, and
lots of duality through the choice of the first operation.

• The networks with three nodes were characterized in Table 2.1.

• The □ operator was introduced and two lemmas about collapsing many nodes
into one were proved.

• The generic node was introduced, and two lemmas followed. The first one
showed that all nodes are similar to it or its dual. The second one showed that
all networks can be reduced to some case in $\mathcal{G}^4$.

• The generalized forms of the ⊟ and ◫ operators were introduced, and then
shown to carry networks first from $\mathcal{G}^4$ to $\mathcal{G}^5$, and then from $\mathcal{G}^N$ to $\mathcal{G}^{N+1}$.

Finally, Theorem 2.18 acts as a characteristic mapping from one network to those
having more nodes. Sometimes more than one path can be taken to the same
network (associativity), and in addition, the inverse operation need not retrace the
actual path taken to create the network. In the next chapter we will explore the
transformation of networks into subproblems, and consult a parallel oracle.


Chapter  3 

Parallel  Decomposition 

Almost daily, researchers in the technical disciplines envisage new and
different uses for parallel computers. Linear programming as a practical
field could never have happened were it not for the invention of the serial
computer [Dan87], which revolutionized the approach to complex problems. And
now, the availability of parallel computers will permit the next quantum expansion in
the set of problems that can be solved. The parallel decomposition algorithm will be
a first step in placing mathematical programming in league with other technologies
making use of these new computers.

We  view  the  ultimate  information  content  of  a  problem  formulation  as  the  solution 
to  the  problem.  To  obtain  the  solution,  we  consult  an  oracle: 

solution = O(problem).

Linear  program  solutions  consist  of  points  and/or  rays  of  the  primal  and  dual  feasible 
regions  of  the  problem.  A  typical  oracle  for  solving  linear  programs  is  the  simplex 
method: 


LP  solution  =  simplex(LP). 

This thesis is concerned with substituting various decomposition algorithms for the
simplex method. The decomposition algorithms are governed by a communication
network between LP subproblems. Different problem structures will result in different
networks and different subproblems. However, it is possible to define a single general
algorithm having the communication network and the subproblem formulations as its
parameters:


LP solution = decomposition(network, subproblems).

The goal of this chapter is to take the information contained in an LP and a
communication network, to produce an equivalent set of information in the form of
a system of subproblems, and finally to find the LP solution using a parallel oracle
operating on this equivalent information.


[Figure 3.1: elapsed time (vertical axis) against processor indices 1 to 3 (horizontal axis) for the steps Read Data (§3.1), Form Subs (§3.2-3.3), Process Subs (§3.4), and Print Solution (§3.4). Legend: work done serially, work in parallel, processor idling. Annotation: optimality (equilibrium) detected.]

Figure  3.1:  Strings  of  work. 


Figure 3.1 lists the steps of parallel decomposition. Along with each step in the figure
are the section numbers of this chapter that explain the step, and a representation
of whether the step is done in serial or in parallel.

Read Data: The first order of business is to define the problem we wish to solve
and the decomposition scheme used to do it. These are specified by the original
data (A, b, c) and the communication network g.

Form Subs: The reverse transform from the communication network into a system
of subproblems is covered in Sections 3.2-3.3. The information obtained in the
Read Data step is processed in parallel during this step.

Process Subs: The parallel processors act as information carriers over the network,
performing oracles on subproblems and filtering the solutions through interfaces.
A relaxation of the nested oracle procedure is shown to perform O(1.1).

Print Solution: From the multitude of final subproblem oracles, we must construct
one for O(1.1). Because the subproblem formulations contain all of the relevant
subproblem solutions, this is a simple filtering process and is done serially.


3.1  Starting  Information 

In the following discussion, we will assume that our linear program formulation takes
the form given in (1.1), namely

$$\min_{x \ge 0} \; c^{T}x = z \quad \text{s.t.} \quad \pi : Ax \ge b. \tag{3.1}$$

3.1.1 The Problem Description

We can break down a problem description into two sets of information: the implicit
information (indices and variables) and the explicit information (problem data).

Indices will play an important role in the discussions of problem structure and
communicated information. Not only the constraint and variable indices are used,
but those for the right-hand side and objective as well. Thus, we define for LP (3.1):

σ to be the objective index,
s to be the right-hand side index,
$\mathcal{R}$ to be the row index set, and
$\mathcal{C}$ to be the column index set.

As in the previous chapter, when discussing partitions, we will use the same symbols
for the names of the primal variables as for the index sets of their corresponding
columns, and the same symbols for the names of the dual variables as for the index
sets of their corresponding rows. Therefore, $\mathcal{R} = \pi$ and $\mathcal{C} = x$ for LP (3.1).

The values of the variables lie in vector spaces that are dimensioned in terms of
their index sets. We see for LP (3.1) that

x the primal variables lie in $\Re^{\mathcal{C}}$, and
π the dual variables lie in $\Re^{\mathcal{R}}$.

Finally, the explicit information needed to give substance to the implicit information
above is the problem data. We will take the convention of positioning this data
within the problem by specifying its indices. For instance, for (3.1):

A the constraint matrix is indexed by (π, x),
b the right-hand side vector is indexed by (π, s), and
c the cost vector is indexed by (σ, x).

This completes the specification of a linear programming problem.

3.1.2  The  Communication  Network  Description 

The previous chapter explained the process of partitioning the row and column index
sets. It also showed how communication networks result from this process. Rather
than operating from partition information, we shall assume that the decomposition
information is in the form of a communication network.

We repeat the definition from Chapter One for completeness. The communication
network is the five-tuple

$$g = (\mathcal{N}, \mathcal{R}_n, \mathcal{C}_n, \mathcal{A}, T_a), \quad \forall n \in \mathcal{N},\ a \in \mathcal{A},$$

where the tuples are defined to be

$\mathcal{N}$ set of nodes,

$\mathcal{R}_n$ node n's row index set,

$\mathcal{C}_n$ node n's column index set,

$\mathcal{A}$ set of arcs, $\mathcal{A} = \{(n_1\ n_2) \in \mathcal{N}^2 : \text{there is an arc from } n_1 \text{ to } n_2\}$,

$T_a$ the type for arc a (up, down, left, or right).

This completes the description of the decomposition scheme to solve LP (3.1).
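
As an illustration only, the five-tuple can be held in a small data structure. The sketch below is not the thesis code: the class name, field names, and the two-node example (one node above another, joined by a down arc and an up arc) are assumptions chosen to mirror the definition, and the index labels are made up.

```python
from dataclasses import dataclass

ARC_TYPES = {"up", "down", "left", "right"}


@dataclass
class CommunicationNetwork:
    """Five-tuple (N, R_n, C_n, A, T_a); all field names are illustrative."""
    nodes: set       # N: set of nodes
    row_sets: dict   # node -> row index set R_n
    col_sets: dict   # node -> column index set C_n
    arcs: set        # A: set of (source, destination) node pairs
    arc_types: dict  # arc -> "up", "down", "left", or "right"

    def validate(self):
        """Check that every node has index sets and every arc is well formed."""
        for n in self.nodes:
            if n not in self.row_sets or n not in self.col_sets:
                raise ValueError(f"node {n} has no index sets")
        for (n1, n2) in self.arcs:
            if n1 not in self.nodes or n2 not in self.nodes:
                raise ValueError(f"arc {(n1, n2)} leaves the node set")
            if self.arc_types.get((n1, n2)) not in ARC_TYPES:
                raise ValueError(f"arc {(n1, n2)} has no valid type")
        return True

    def incoming(self, n):
        """Arcs whose destination is node n, in sorted order."""
        return sorted(a for a in self.arcs if a[1] == n)


# Hypothetical two-node network: node 1 sits above node 2.
net = CommunicationNetwork(
    nodes={1, 2},
    row_sets={1: {"pi1"}, 2: {"pi2"}},
    col_sets={1: {"x"}, 2: {"x"}},
    arcs={(1, 2), (2, 1)},
    arc_types={(1, 2): "down", (2, 1): "up"},
)
```

Keeping the arc types in a dictionary keyed by the arc makes it easy to look up what kind of information a given incoming arc delivers when a subproblem is formed.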

3.2  Intermediate  Information 

Several information structures are constructed from the starting information in order
to facilitate the formulations of the subproblems. These are: the Incidence Graph
used to derive subproblem interfaces, the Arc Index Sets which index passed information,
and the Partition Graphs which identify implicit subproblems and synchronized
information.

3.2.1  The  Incidence  Graph 

(A, b, c) → h

The incidence graph h is created from the explicit information (A, b, c). It is bipartite
with one class of nodes over the objective and constraint indices and the other class
of nodes over the right-hand side and variable indices. Two nodes are connected
(always between the two classes) if there is a nonzero entry in the data (A, b, c)
corresponding to the two indices of the linked nodes. For instance, if


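$$c = \begin{pmatrix} c_1 & 0 \end{pmatrix}, \quad b = \begin{pmatrix} b_1 \\ 0 \end{pmatrix}, \quad A = \begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix},$$

then the corresponding incidence graph is that in Figure 3.2, and

$$h = (\{\sigma, \pi_1, \pi_2, s, x_1, x_2\},\ \{(\sigma\ x_1), (\pi_1\ s), (\pi_1\ x_1), (\pi_2\ x_1), (\pi_2\ x_2)\}).$$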

The nodes of h correspond to aggregations of the rows and columns of the linear
program so that it represents the incidence between blocks of coefficients. Note that
there is never a link between the two nodes σ and s, but all other links between the
two classes of nodes are possible.
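
The construction of h can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis implementation: the data are stored sparsely as dictionaries, the function name and the labels "sigma" and "s" are hypothetical, and the small example data are made up.

```python
def incidence_graph(A, b, c, objective="sigma", rhs="s"):
    """Build a bipartite incidence graph from sparse LP data.

    A: dict {(row, col): value}, b: dict {row: value}, c: dict {col: value}.
    Returns (row_class, col_class, links), where links join a node of the
    objective/constraint class to a node of the right-hand-side/variable class.
    """
    row_class = {objective} | {i for (i, _) in A} | set(b)
    col_class = {rhs} | {j for (_, j) in A} | set(c)
    links = {(i, j) for (i, j), v in A.items() if v != 0}
    links |= {(i, rhs) for i, v in b.items() if v != 0}
    links |= {(objective, j) for j, v in c.items() if v != 0}
    # No (objective, rhs) link is ever produced by the three rules above.
    return row_class, col_class, links


# Hypothetical block data: zero entries generate no link.
A = {("pi1", "x1"): 1.0, ("pi1", "x2"): 0.0, ("pi2", "x1"): 2.0, ("pi2", "x2"): 3.0}
b = {"pi1": 4.0, "pi2": 0.0}
c = {"x1": 5.0, "x2": 0.0}
row_class, col_class, links = incidence_graph(A, b, c)
```

Because links are only ever created from an entry of A, b, or c, the objective node and the right-hand side node can never be joined, which matches the remark in the text.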


Figure 3.2: An incidence graph.


Future research along these lines will probably concern various optimal partitioning
schemes, based on the coupling between subproblems and the level of computation
needed to obtain subproblem solutions. Some good references on incidence graphs
are [Ros70, Bun76, Tar76].


3.2.2  Arc  Index  Sets 


$(g, h) \to (\mathcal{R}_a, \mathcal{C}_a)$

Recall from the Subproblem Interface Theorem that we need only pass a selection of
a subproblem's solution over any given arc. The selection is made by the arc's index set.
We represent these sets as follows:

• if a is a vertical arc (up or down), it has a row coupling index set $\mathcal{R}_a$, and

• if a is a horizontal arc (left or right), it has a column coupling index set $\mathcal{C}_a$.

This means that either rows couple partitioned columns or columns couple partitioned
rows. An arc is represented by two nodes. Thus arc index sets will have two nodes
as subscripts. Node index sets have only one node for a subscript.

Let  the  arc  (»t  n2)  be  horizontal;  then 

7^,  V  paths  jkl  in  the  graph  /*,  h  G  (7£„,  H  7 l»3)J  G  Cn|, /  G  Cn,}. 

Let  the  arc  (ri|  n a)  be  vertical;  then 

C„,  nj  =  (i  :  V  paths  ijk  in  the  graph  hj  G  (C„,  nCnj)  Us,f  G  7^,,^*  G  7£nj). 

Secondly, the theorem says that according to given interfaces, the objective (right-hand
side) values must appear in the topmost (leftmost) subproblems containing constrained
variables (non-vacuous constraints), respectively. When a topmost subproblem
has unconstrained variables or a leftmost subproblem has vacuous constraints,
the theorem also says that an incoming up or left arc, respectively, carries the value
of $c^{T}x$ or $\pi^{T}b$, respectively.

To determine in general whether an up arc ought to carry $c^{T}x$ along with x and
whether a left arc ought to carry $\pi^{T}b$ along with π, follow the simple rules:

1. up arcs carry $c^{T}x$ if there are objective coefficients in the formulation of a
subproblem below, and

2. left arcs carry $\pi^{T}b$ if there are right-hand side coefficients included in the
formulation of a subproblem to the right.

In other words:

if $T_{n_1 n_2} = \text{up}$ and $\exists\, n_3 < n_2$ s.t. $c_{n_3}(j) = c_j$ for some $j \in \mathcal{C}_{n_3}$,

then the index set of arc $(n_1\ n_2)$ is augmented by σ.

Left arcs will carry s if there are right-hand side coefficients included in the formulation
of a subproblem to the right:

if $T_{n_1 n_2} = \text{left}$ and $\exists\, n_3 < n_2$ s.t. $b_{n_3}(i) = b_i$ for some $i \in \mathcal{R}_{n_3}$,

then the index set of arc $(n_1\ n_2)$ is augmented by s.

3.2.3  Partition  Graph  Description 

Partition graphs identify implicit subproblems created by cross-nesting the decomposition
operators. They are used to designate what information must be synchronized
by determining how λ, θ, l, and t are indexed. The solutions of subproblems
corresponding to the nodes in a partition graph define the solution to an implicit
subproblem, one that is not solved explicitly because it was decomposed. Before
formally introducing the partition graph, we first define the following three graphs:

a vertical graph is a graph with all vertical arcs,

a horizontal graph is a graph with all horizontal arcs, and

a subgraph is a graph $s = (\mathcal{A}_s, \mathcal{N}_s)$, written $s \subseteq g$ where $g = (\mathcal{A}, \mathcal{N})$, if and only if
$\mathcal{N}_s \subseteq \mathcal{N}$ and $\mathcal{A}_s \subseteq \mathcal{A} \cap \mathcal{N}_s^2$.

The information contained in the communication network g is used to generate
its set of partition graphs $\mathcal{P}_g$ and their accompanying row and column index sets $\mathcal{R}_p$
and $\mathcal{C}_p$, for all $p \in \mathcal{P}_g$:

$$g \to (\mathcal{P}_g, \mathcal{R}_p, \mathcal{C}_p), \quad \forall n \in \mathcal{N},\ a \in \mathcal{A},\ p \in \mathcal{P}_g,$$

where

$\mathcal{P}_g$ is the set of all partition graphs in g,

$\mathcal{R}_p$ is the row index set for partition graph $p \in \mathcal{P}_g$, and
$\mathcal{C}_p$ is the column index set for partition graph $p \in \mathcal{P}_g$.

Definition 3.1 (Partition Graph) A partition graph p is a horizontal or vertical
subgraph of g created by cross nesting one operator within another. The partition
graphs of the communication network g are contained in the set $\mathcal{P}_g$. Each graph p is
a collection of nodes $\mathcal{N}_p$ connected by arcs $\mathcal{A}_p$.

For all $p = (\mathcal{N}_p, \mathcal{A}_p) \in \mathcal{P}_g$ we define row and column index sets to be the union of
the node index sets contained in the graph:

$$\mathcal{R}_p = \bigcup_{n \in \mathcal{N}_p} \mathcal{R}_n, \qquad \mathcal{C}_p = \bigcup_{n \in \mathcal{N}_p} \mathcal{C}_n.$$

Lemma 3.2 (Partition Graph Ordering) There is a partial ordering on the partition
graphs $\mathcal{P}_g$ of a given network $g \in \mathcal{G}^N$, based on the highest ordered node contained
within them.

Proof: There is an ordering on the nodes, and all partition graphs are maximally
connected on horizontal or vertical arcs. Therefore, no partition graph can be a
subgraph of another, and there must be a node in each that is of greatest order. Such
nodes from different partition graphs are different and in turn partially ordered. ∎
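
The proof's observation that partition graphs are maximally connected on arcs of one orientation suggests a simple way to enumerate them. The sketch below rests on that reading and is an assumption, not the thesis procedure: it treats every maximal group of nodes joined by vertical arcs (or by horizontal arcs) as one partition graph and forms its index sets by union. The function name, the dictionary layout, and the three-node example are hypothetical.

```python
VERTICAL = {"up", "down"}
HORIZONTAL = {"left", "right"}


def partition_graphs(nodes, arc_types, row_sets, col_sets):
    """Group nodes into maximal components joined by arcs of one orientation."""
    result = []
    for orientation, kinds in (("vertical", VERTICAL), ("horizontal", HORIZONTAL)):
        # Undirected adjacency restricted to arcs of this orientation.
        adjacent = {n: set() for n in nodes}
        for (n1, n2), kind in arc_types.items():
            if kind in kinds:
                adjacent[n1].add(n2)
                adjacent[n2].add(n1)
        seen = set()
        for start in sorted(nodes):
            if start in seen or not adjacent[start]:
                continue  # skip visited nodes and nodes with no such arcs
            component, stack = set(), [start]
            while stack:
                n = stack.pop()
                if n not in component:
                    component.add(n)
                    stack.extend(adjacent[n] - component)
            seen |= component
            result.append({
                "orientation": orientation,
                "nodes": component,
                "rows": set().union(*(row_sets[n] for n in component)),
                "cols": set().union(*(col_sets[n] for n in component)),
            })
    return result


# Hypothetical network: node 1 above node 2, node 3 to the right of node 2.
arc_types = {(1, 2): "down", (2, 1): "up", (2, 3): "right", (3, 2): "left"}
row_sets = {1: {"pi1"}, 2: {"pi2"}, 3: {"pi2"}}
col_sets = {1: {"x"}, 2: {"x"}, 3: {"x"}}
graphs = partition_graphs({1, 2, 3}, arc_types, row_sets, col_sets)
```

In this made-up example the vertical component collects rows pi1 and pi2 while the horizontal component keeps only pi2, which is the pattern of unions described by the definitions of $\mathcal{R}_p$ and $\mathcal{C}_p$.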

Here are two properties of partition graphs.

Property 3.3 (Similar Rows or Columns) If p is a horizontal partition graph,
$\mathcal{R}_n$ is constant for all $n \in \mathcal{N}_p$. Likewise, if p is a vertical partition graph, $\mathcal{C}_n$ is
constant for all $n \in \mathcal{N}_p$.

Property 3.4 (Parent/Child Incidence) If p and c are partition graphs and p is
the parent of c, then if p is vertical, $\mathcal{C}_p = \mathcal{C}_c$ and $\mathcal{R}_c = \mathcal{R}_n$, where $n = \mathcal{N}_p \cap \mathcal{N}_c$.
Likewise, if p is horizontal, $\mathcal{R}_p = \mathcal{R}_c$ and $\mathcal{C}_c = \mathcal{C}_n$, where $n = \mathcal{N}_p \cap \mathcal{N}_c$.

Take as an example the application of the column partition operator on the bottom
node of the D-W network $g_0$. The two partition graphs from that network are
displayed in Figure 3.3. Their row and column index sets are

$$\mathcal{R}_{p_1} = \pi_1 \cup \pi_2, \quad \mathcal{R}_{p_2} = \pi_2,$$
$$\mathcal{C}_{p_1} = x, \quad \mathcal{C}_{p_2} = x.$$

Figure 3.3: Partition graphs from splitting the bottom node.

The information carried by the up arcs from $p_2$ to node 1 must be synchronized and
added to (2.9) as a single column. When formulating this subproblem, we purposely
included a single set of variables l with which to take combinations of these columns.

3.3  Forming  Subproblems 

We concern ourselves now with a philosophical question on the information contained
in a linear program specification, and how to obtain that information from a communication
network in order to fully specify the subproblems used in a parallel oracle.

The following discussion concerns the dichotomy of structure and content. Translated
to mathematics, these terms become symbols and meaning.

Definition 3.5 (Symbolic Representation) An object is represented symbolically
by the members of its structure and their relations to each other.

Lemma 3.6 (Symbolic Linear Program) A symbolic representation of a linear
program is contained in $\mathcal{R}$ and $\mathcal{C}$ and an assumed standard form (3.1).

Since subproblem n is a linear program, its symbolic information consists of $\mathcal{R}_n$ and
$\mathcal{C}_n$ and an assumed standard form.

Theorem 3.7 (Necessary Information) The following information is required to
obtain a solution to a linear program: a symbolic representation in the form of
row and column indices $\mathcal{R}$ and $\mathcal{C}$ for the default formulation (3.1), problem data
in the form of (A, b, c) for (3.1), and an oracle.

To define the reverse transform, the indices of each linear program subproblem are
obtained from the communication network by identifying their names and subscripts,
and determining their dimensions. The result is a symbolic representation of each
subproblem. Together with a description of the problem data and the simplex method,
we can perform the oracle on any subproblem.

Definition 3.8 (Symbolic Subproblems) The reverse transformation is an extraction
of the symbolic representation of the subproblems from the communication
network. We define it as a two step process:

Index Sets The subproblem index sets are defined in Tables 3.1 and 3.3 as a translation
from the node index sets and the arcs entering the node. Each index has
two parameters: its subscript, and its dimension.

Default Formulation Tables 3.3, 3.4, 3.5, and 3.6 comprise the standard subproblem
formulation, defined in terms of the incidence between the subproblem's row
and column indices. The standard subproblem formulation is summarized in
Table 3.7.

3.3.1  The  Formulation  Procedure 

We follow a procedure of determining the subproblem index sets, which then determines
the default formulation. From the position of a subproblem's corresponding
node in the communication network (e.g., topmost or leftmost) we can determine
the partition of the original data over the set of subproblems.

Original Variables: The variables $x_n$ and $\pi_n$ appear in a subproblem based on
the node index sets. If $\mathcal{R}_n$ is not empty then $\pi_n$ appears. If $\mathcal{C}_n$ is not empty then
$x_n$ appears. It is possible for one to appear and not the other. These results are
summarized in Table 3.1.

Variable   Dimension         Subscript   Appears
π          $\mathcal{R}_n$   n           if $\mathcal{R}_n \ne \emptyset$
x          $\mathcal{C}_n$   n           if $\mathcal{C}_n \ne \emptyset$

Table 3.1: Original variables.

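Original Data: Independent portions of A, b, and c will be used to define subproblems
based on their node index sets:

$$A_n = \{A_{ij} : i \in \mathcal{R}_n,\ j \in \mathcal{C}_n\}.$$

Lemma 3.9 (Placement of b and c) The right-hand side coefficients indexed by
$i \in \mathcal{R}_n$ for node n are placed as follows:

$$b_n(i) = \begin{cases} b_i & \text{if } n \text{ is maximal such that } (i\ j) \in A \text{ for some } j \in \mathcal{C}_n, \\ 0 & \text{otherwise.} \end{cases}$$

The objective coefficients indexed by $j \in \mathcal{C}_n$ for node n are placed as follows:

$$c_n(j) = \begin{cases} c_j & \text{if } n \text{ is maximal such that } (i\ j) \in A \text{ for some } i \in \mathcal{R}_n, \\ 0 & \text{otherwise.} \end{cases}$$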


Proof: 

1. Begin with full arc index sets and leftmost and topmost placement of b and c,
respectively.

2. By the Subproblem Interface Theorem, we redefine the arc index sets of those
down and right arcs $(n\ n')$ incident to topmost and leftmost nodes $n \in \mathcal{N}$ and
thus move down all $c_n(j) : j \in \mathcal{C} - \mathcal{C}_{nn'}$, and right all $b_n(i) : i \in \mathcal{R} - \mathcal{R}_{nn'}$.

3. For each down arc, there is a corresponding up arc that must have its index
set augmented by σ for the objective row if a node below contains any original
objective coefficients.

4. Steps 2 and 3 can be repeated as often as necessary to achieve the result in
Lemma 3.9.
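
The block extraction $A_n$ and the placement rule of Lemma 3.9 can be sketched as follows. This is a minimal illustration and makes two assumptions that are not in the text: nodes are labelled by integers whose natural order stands in for the node ordering, and "maximal" is read as the largest such label. The function names and the example data are hypothetical.

```python
def node_block(A, rows, cols):
    """A_n = {A_ij : i in R_n, j in C_n} for sparse A stored as {(i, j): value}."""
    return {(i, j): v for (i, j), v in A.items() if i in rows and j in cols}


def place_rhs(A, b, row_sets, col_sets):
    """Give each b_i to the highest-labelled node whose block has a nonzero in row i.

    Every other node holding row i receives 0 for that coefficient.
    """
    placed = {n: {i: 0.0 for i in row_sets[n]} for n in row_sets}
    for i, value in b.items():
        owners = [
            n for n in row_sets
            if i in row_sets[n]
            and any(v != 0 for v in node_block(A, {i}, col_sets[n]).values())
        ]
        if owners:
            placed[max(owners)][i] = value
    return placed


# Hypothetical data: two nodes share row "r" but hold different columns.
A = {("r", "x1"): 1.0, ("r", "x2"): 2.0}
row_sets = {1: {"r"}, 2: {"r"}}
col_sets = {1: {"x1"}, 2: {"x2"}}
placed = place_rhs(A, {"r": 5.0}, row_sets, col_sets)
```

The objective coefficients can be placed by the same routine with the roles of rows and columns exchanged.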


Original data are indexed by the node index sets. Both the data and their indices
carry the same subscripts. These results are summarized in Table 3.2.


Data   Indices   Subscript   Appears
A                n           if $\mathcal{R}_n \times \mathcal{C}_n \ne \emptyset$
b      1         n           if $\mathcal{R}_n \ne \emptyset$
c      1         n           if $\mathcal{C}_n \ne \emptyset$

Table 3.2: Original data.


Incoming Arcs: The added variables and added data of a subproblem are those
other than the originals. Their appearance in the formulation is governed by the
incoming information, i.e., the incoming arcs. They form structures for handling the
information as it arrives, placing it into the formulation so that it will have the proper
effect.

An ambiguity arises here. Information transported along up (left) arcs is used
to form additional rows (columns) in the formulation. If there is more than one up
or left arc, there can be a choice as to how the information gets incorporated into
the formulation that is not specified in the communication network. That choice,
for adding columns, has to do with the number of convexity constraints to keep.
One is sufficient, but more than one will give the region being approximated greater
resolution. Our default choice will be for the latter.

When the incoming up arc has its source in a different partition graph, there
is no choice; there must be one convexity constraint for each such partition graph.
This forces the information from each partition graph to be coordinated into one new
column. Likewise, for left arcs the default will be to add individual constraints, and
only when the arc's source lies in a different partition graph will we add only one
constraint for all the information arriving from that graph.

In the following descriptions of added variables and data, we adopt the convention
that the generic incoming arc to node n is $a = (n_1, n)$.


Added Variables: All added variables are subscripted by the incoming arc that
generated them, except for the case when the source of the incoming arc is in a child
partition p. The variables λ, θ, l, and t should then be subscripted by p. This causes
a single primal or dual convexity constraint to be created for each child partition as
required. Table 3.3 shows which added variables are affected by synchronization.
The dimensions $K_{n_1}$ and $K_p$ are defined as

$K_n$ : the number of solutions broadcast by subproblem $n \in \mathcal{N}$,
$K_p$ : the number of solutions broadcast by partition graph $p \in \mathcal{P}$,

where the term broadcast refers to the practice of communicating a subproblem solution
over the outgoing arcs of the corresponding node.


Variable   Subscript   Appears when incoming   Dimension
λ          a or p      left arc                $K_{n_1}$ or $K_p$
ν          a           down arc                1
ψ          a           up arc                  $\mathcal{C}_a$
η          a           right arc               1
θ          a or p      up arc                  1
l          a or p      up arc                  $K_{n_1}$ or $K_p$
u          a           right arc               1
y          a           left arc                $\mathcal{R}_a$
w          a           down arc                1
t          a or p      left arc                1

Table 3.3: Added variables.

Added Data: These structures are generated by incoming arcs. Information carried
by up and left arcs is accumulated, whereas that carried by down and right arcs
is overwritten. This important point separates standard LP decomposition from the
class of totally symmetric algorithms. One method proposed to overcome this is to
incorporate a proximal-point penalty term into the objective function [Roc76, Gol86,
BeT89]. Table 3.4 shows which data are affected by synchronization. The indexing of
each data item positions it with respect to the subproblem variables and constraints.
When an added variable is subscripted by an incoming arc, the incident data is subscripted
by the reverse arc. The reverse of arc $a = (n_1, n)$ is $\bar{a} = (n, n_1)$. When an
added variable is subscripted by a partition graph p, the incident data γ and g are
subscripted by p also. The incident data Π and X are subscripted by the incoming
arc. The indexing then defines a single block of constraints or columns, since one is
subscripted by the arcs and the other by p.

We now offer word descriptions of the added data presented in Table 3.4:

$\gamma_a$ : the optimality indicators for the dual solutions that are passed over arc a,

$g_a$ : the optimality indicators for the primal solutions that are passed over arc a,
$\Pi_a$ : the translated dual solutions that are passed over arc a,

$X_a$ : the translated primal solutions that are passed over arc a,

$I_a$ : a matrix that translates dual (primal) solutions passed over up (left) arc a.

The passed information is either placed directly into the formulation of the destination
subproblem, or incorporated into an existing structure. For down and right arcs,
only the latest solution is used. The new information is written directly over the old
and appears in the formulation as $\psi_a$ or $y_a$. For up and left arcs, the information
is accumulated, and appears as an expandable structure in the formulation of the
destination subproblem. Each new piece of information causes the row or the column
dimension of the structure to increase by one, and so these dimensions are indexed by
the number of times the source subproblem has been solved. Specifically, for $k \in K_{n_1}$
and arc $a = (n_1, n)$,

$$\gamma_a(k) = \begin{cases} 1 & \text{if the } k\text{th solution to } n_1 \text{ is optimal,} \\ 0 & \text{otherwise,} \end{cases}$$

$$X_a = \{x^{k}_{n_1} : k \in K_{n_1}\}, \qquad \Pi_a = \{\pi^{k}_{n_1} : k \in K_{n_1}\}.$$

Table 3.4: Added data.


The following matrices are permutation matrices (not necessarily square). Their
entries serve to translate the primal (dual) solution of an up (left) arc into the rows
(columns) of the destination that are coupled to the source. For a given vertical arc
$a = (n_1, n)$,

$$I_a(i, j) = \begin{cases} 1 & \text{if the } j\text{th element of the set } \mathcal{R}_a \text{ is } i, \\ 0 & \text{otherwise.} \end{cases}$$

If a is horizontal, then

$$I_a(i, j) = \begin{cases} 1 & \text{if the } i\text{th element of the set } \mathcal{C}_a \text{ is } j, \\ 0 & \text{otherwise.} \end{cases}$$
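
A translation matrix of this kind is easy to build from an ordered coupling set. The following sketch covers the vertical case only and is illustrative: the function names, the list-of-lists representation, and the index labels are assumptions, and the coupling set is taken to be an ordered list.

```python
def translation_matrix(coupling_set, destination_indices):
    """Entry (i, j) is 1 when the j-th element of the coupling set is index i.

    Rows follow destination_indices, columns follow the ordered coupling set,
    so the matrix need not be square.
    """
    return [[1 if element == i else 0 for element in coupling_set]
            for i in destination_indices]


def apply_matrix(matrix, vector):
    """Plain matrix-vector product on lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical arc coupling rows r2 and r4 of a destination with four rows.
I_a = translation_matrix(["r2", "r4"], ["r1", "r2", "r3", "r4"])
scattered = apply_matrix(I_a, [7, 9])
```

Applying the matrix scatters the two passed values into the coupled positions of the destination and leaves zeros elsewhere.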


Non-negativity: Variables restricted to be non-negative and others not sign restricted
are given in Table 3.5. The original variables $x_n$ can be either non-negative
or free. The default is free. When n is a bottommost node with a non-vacuous
original column, $x_n \ge 0$. It is true that $x_n$ could be non-negative in subproblems that are not
bottommost, but it is sufficient that the condition hold in any one subproblem. We
chose the bottommost one to guarantee that it has at least one extreme point.


Index   Setting
$l_a$   ≥ 0
$u_a$   ≥ 0
$x_n$   free or ≥ 0
$y_a$   free
$w_a$   free
$t_a$   free

Table 3.5: Non-negativity.

Constraint Types: The types for constraints, whether equality or inequality, are
given in Table 3.6. There are two choices for the $\pi_n$ corresponding to the primal
constraints. The default is equality. If n is a rightmost node, the constraint is an
inequality as shown. Similar to determining non-negativity for $x_n$, all $\pi_n$ constraints
could be inequalities but it is sufficient for only those in rightmost subproblems with
non-vacuous original constraints.


Index                      Setting
$\lambda_a$, $\lambda_p$   ≥
$\nu_a$                    ≥
$\pi_n$                    = or ≥
$\psi_a$                   =
$\eta_a$                   ≥
$\theta_a$, $\theta_p$     =

Table 3.6: Constraint types.


3.3.2  Summary 

The variable and data information tables are partially summarized in Table 3.7.
Subproblem formulations are derived from this standard form. Given a node n and
all its incoming arcs, e.g., $a = (n_1, n)$, this table will generate one subproblem in the
schema of the communication network.

The most interesting feature of Table 3.7 is its symmetry with respect to the
relations between Dantzig-Wolfe and Benders decomposition. The Greek and Roman
symbols are interchanged by taking the transpose. Another feature to notice is that
the series of entries $X_a$, $y_a$, $\psi_a$, and $\Pi_a$ are all added data structures to handle passed
information with entries in real space, while the series of indicator entries, such as
$\gamma_a$, $g_a$, and $\pm I_a$, are added structures with entries in binary space. The entries in the second
series serve as indicators of what functions their corresponding real space information
will serve, and how they will impact the subproblem formulation. Together, both
series are a constellation of structures bordering the original data $A_n$ like the planets
orbiting the sun, each bringing to bear its own fundamental force on the central mass.

[Table 3.7: Template for generating subproblems. Columns: the primal variables $l_a$, $u_a$, $x_n$, $y_a$, $w_a$, $t_a$ with bounds ≥ 0, ≥ 0, free or ≥ 0, free, free, free; rows: the constraints of Table 3.6; the tableau borders the original data $A_n$, $b_n$, $c_n$ with the added data of Table 3.4.]

3.4  The  Parallel  Oracle 

In this section we assume that we have available to us a collection of decomposition
subproblems that are an equivalent symbolic representation of some original LP formulation.
When coupled with data values and an oracle we can obtain a solution to
the original LP.

When we finish solving a subproblem in a decomposition scheme it is well known
that any neighboring subproblem on the network is now eligible to receive the solution
for the purpose of updating its formulation. The discussion that follows comes from
a very simple idea:

Why  not  solve  all  of  the  neighboring  subproblems  at  the  same  time? 

Thus, we will modify the nested oracle, which was designed to work between two
problems (a Master and a Slave). There are two steps:

• enlarge the set of communicable information to include interior points of the
Slave, but keep it finite, and

• broadcast information instead of having only two-way conversations,

where we intend the term broadcast to mean that a node communicates information
over its outgoing arcs to all of its neighbors in a communication network. The first
step completely blurs the distinction of Master and Slave, and the second suggests
using a parallel computer. The proof of the oracle is in terms of the validity of the
above two relaxations of the nested oracle.

Definition  3.10  (Relaxed  Oracle)  When  consulted,  the  relaxed  oracle  Or(-)  pro¬ 
vides  either: 

•  a  primal  feasible  point  x  or  a  dual  feasible  point  tc, 

•  a  feasible  primal  my  x,  or 

•  a  feasible  dttal  ray  H, 

where  this  information  is  taken  from  a  finite  set  that  includes  all  extreme  points  and 
r.xtmmc  mys. 

Lemma  3.11  (Relaxed  Oracle)  The  finitcncss  aryument  for  the  D-W  method  is 
not  inhibited  by  a  substitution  of  the  relaxed  omclc  OrO  for  the  regular  omclc  O(-) 
in  Steps  3  and  3. 

Proof: A review of that argument will show that the information communicated up
and to the left between subproblems has two essential properties:

•  the  information  comes  from  a  finite  set; 

•  the  finite  set  includes  all  extreme  points  and  rays. 

I 

Lemma 3.12 (Broadcasting Information) The practice of broadcasting subproblem
solutions does not inhibit finite convergence of the D-W method.


Proof: The proof is simple. Broadcasting does not alter the set of communicable
information when the relaxed oracle is used. I




The  following  is  a  corollary  to  the  Reverse  Transform  Theorem  that  will  be  referred 
to  for  direction  in  the  Parallel  Oracle. 

Corollary 3.13 (Subproblem Modifications) Arc types govern the types of
modifications made to their destination subproblems as follows:

up  arc  add  a  column, 
down  arc  modify  the  objective  function, 
left  arc  add  a  row, 
right  arc  modify  the  right-hand  side. 
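
The correspondence in Corollary 3.13 can be read as a simple lookup from arc type to modification. The sketch below is only an illustration of that reading; the names `ARC_MODIFICATION` and `modifications_for` are hypothetical and do not come from the thesis code.

```python
# Illustrative lookup for Corollary 3.13 (hypothetical names, not the thesis code).
ARC_MODIFICATION = {
    "up": "add a column",
    "down": "modify the objective function",
    "left": "add a row",
    "right": "modify the right-hand side",
}


def modifications_for(incoming_arc_types):
    """Return one modification per incoming arc that carried new information."""
    return [ARC_MODIFICATION[arc_type] for arc_type in incoming_arc_types]


# A subproblem that heard from a left arc and a right arc:
print(modifications_for(["left", "right"]))
```

In a staircase network only the last two entries are used, since every arc is either a left arc or a right arc.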

The Overall Solution Lemma tells how the solution of (3.1) is constructed from
the individual subproblem solutions.

Lemma 3.14 (Overall Solution) The primal and dual solutions $(x, \pi)$ to (3.1) are

$$x = \bigcup_{n \in \mathcal{N}_L} x_n \quad \text{and} \quad \pi = \bigcup_{n \in \mathcal{N}_U} \pi_n,$$

where $\mathcal{N}_L$ are the leftmost nodes of $\mathcal{G}$ and $\mathcal{N}_U$ are its topmost nodes.

Proof: By induction on the levels of partition graphs.

1. The lemma is true for any vertical or horizontal partition graph:

• The lemma is true for the D-W and Benders methods.

• Assume the lemma is true for a partition graph with $l$ levels.

• Use the D-W or Benders method on the rightmost or bottommost two
subproblems of a partition graph with $l + 1$ levels and reduce the number of
levels to $l$.

2. Assume the lemma is true for $l$ levels of partition graphs.

3. If another level of partition graphs is added to the network, it will be to the
right or below, leaving the solution still at the top and leftmost nodes. So the
lemma must be true for $l + 1$ levels also.


I 




The statement of the parallel oracle is based on the premise of independent work
units we will call jobs. Our work units are modifying and solving subproblems, so there
is a one-to-one correspondence between jobs and subproblems. Jobs are submitted to
be serviced by any processor, and held pending until one becomes available. We use
the term non-pending in the theorem to refer to those jobs that are not waiting to be
processed: either running or not submitted.

Theorem 3.15 (Parallel Oracle) This procedure performs $O(3.1)$.

1. Formulate all of the subproblems $\{1, \ldots, N\}$ and submit a job for each one.

2. Repeat the following until there are no more jobs:

• Get a job with its associated subproblem $n$.

• Use the Subproblem Modifications Corollary (Corollary 3.13) to determine what
modifications to make to the subproblem based on any new information.

• Consult the oracle $O(n)$.

• If the oracle does not repeat the same solution, then broadcast it and submit
a job for each non-pending neighbor.


Proof: We need to show that solutions provided by $O(\cdot)$ for any subproblem will
satisfy the restrictions for information passed over arcs. These restrictions are defined
by nesting the relaxed oracle $O_r$. The proof is by induction on the number of levels
of partition graphs.

1. Information passed up and left from one partition graph to another satisfies
$O_r(\cdot)$:

• The oracle-provided solution to a D-W Master problem always satisfies the
conditions for the relaxed oracle.

• Assume that, for vertical partition graphs with $l$ levels, the oracle-provided
solution of the topmost node satisfies the conditions for the relaxed oracle.

• Take a vertical partition graph with $l + 1$ levels. The non-topmost nodes
implicitly represent a D-W Slave problem and they themselves satisfy the
relaxed oracle by the induction step. The topmost node must also satisfy
the relaxed oracle by the Master/Slave relation and by the fact that a top
node in a two-node scheme also satisfies the relaxed oracle.

Take the dual of the above argument to prove the case for left arcs.

2. Assume the theorem is true for $l$ levels of partition graphs.

3. Take a network with $l + 1$ levels of partition graphs, where the first one is
vertical. The second-level graphs provide solutions to the first level that satisfy
the relaxed oracle by the induction step. By Step 1, the first level must also,
and thus all $l + 1$ levels.


I 
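
The job loop of Theorem 3.15 can be sketched as a serial simulation. This is a minimal illustration under stated assumptions, not the thesis implementation: the function `parallel_oracle`, the toy oracle, and the three-node chain are all hypothetical, and the toy oracle simply propagates a maximum value instead of solving a linear program. Because the simulation is serial, a "non-pending" neighbor is simply one that is not already in the queue.

```python
from collections import deque


def parallel_oracle(neighbors, oracle):
    """Serial simulation of the job loop in Theorem 3.15.

    neighbors: dict mapping each subproblem to the subproblems reached
               by its outgoing arcs.
    oracle:    callable (n, inbox) -> solution; a stand-in for O(n) applied
               to subproblem n after it is modified with the messages in inbox.
    """
    pending = deque(neighbors)            # step 1: one job per subproblem
    queued = set(neighbors)               # subproblems with a pending job
    inbox = {n: {} for n in neighbors}    # latest message from each sender
    last = {n: None for n in neighbors}   # last solution broadcast by n
    solves = 0

    while pending:                        # step 2: until no jobs remain
        n = pending.popleft()
        queued.discard(n)
        solution = oracle(n, dict(inbox[n]))   # modify, then consult O(n)
        solves += 1
        if solution != last[n]:           # new solution: broadcast it
            last[n] = solution
            for m in neighbors[n]:
                inbox[m][n] = solution
                if m not in queued:       # submit only non-pending neighbors
                    pending.append(m)
                    queued.add(m)
    return last, solves


# Toy data (assumed for illustration): a chain 1 - 2 - 3.
chain = {1: [2], 2: [1, 3], 3: [2]}
values = {1: 5, 2: 1, 3: 3}


def toy_oracle(n, inbox):
    """Stand-in oracle: the largest value seen so far at node n."""
    return max([values[n], *inbox.values()])


final, solves = parallel_oracle(chain, toy_oracle)
print(final, solves)
```

The loop stops because the toy oracle can only return finitely many distinct values, which mirrors the role of the finite set of communicable information in the relaxed oracle.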

This  chapter  has  been  an  outline  of  how  to  implement  parallel  decomposition 
from  start  to  finish.  We  assumed  that  the  work  of  defining  a  communication  network 
was  already  done,  and  that  the  remainder  of  the  work  was  to  form  subproblems  and 
execute  the  parallel  oracle.  One  subtle  point  was  made  in  the  Overall  Solution  Lemma, 
and  that  is  that  it  is  a  relatively  simple  matter  to  construct  the  overall  solutions.  By 
more  traditional  methods,  this  is  often  a  tricky  exercise  in  data  management. 

Finally, the simple loop of Listen, Modify, Evaluate, and Broadcast is our gerbil
on a treadmill, which together with many others like it, is more powerful than the
strongest workhorse, and faster too. We will see this conclusion supported in the
results of the next chapter.


Chapter  4 


Results  for  Staircase  Linear 
Programs 

Does parallel decomposition make effective use of the machine it is designed
to exploit? A Fortran 77 program that solves staircase linear programs
was written to find a practical answer to this question. This code has run
on two different shared-memory multiprocessing computers: a Sequent Balance 8000
and an IBM 3090/600E. Preliminary results on the Sequent computer were reported
in [Ent88]. More extensive results on the IBM 3090 are reported here. The parallel
algorithm is inherently message based. As a result, the shared-memory implementation
actually simulates a message-passing/distributed-memory parallel computer,
using the Intel iPSC subroutine library as a standard interface.

Naturally, the decomposition code must solve linear program subproblems. This
is accomplished by calling MINOS 5.1 [MS87] as a subroutine [Ent87]. Likewise, the
best comparison for the decomposition method is to solve the same test problems
using MINOS as a stand-alone system. This approach allowed many implementation
differences to be eliminated, and permitted the merits of decomposition and parallel
decomposition alone to be discussed. The tests of the decomposition code were
designed to:

• produce results from which to judge the merits of parallel decomposition,

•  investigate  the  algorithm’s  performance  under  different  parameter  settings, 

•  provide  performance  extrapolations  outside  the  set  of  test  problems,  and 

• outline the current limitations of the code and areas for improvement.

Emphasis is placed on demonstrating that decomposition and added processors
provide faster solutions, with acceptable accuracy.

Our presentation of computational results is based on the suggested standards of
[CDM79] and [JBNP89]. In addition, several similar presentations were considered,
including [IIicS2, IILSIb]. Section 4.1 covers the theoretical basis of the computer
code, and its software implementation. Section 4.2 gives details of the experimental
apparatus and presents results that support the appropriateness of parallel
decomposition on staircase problems. Finally, the conclusions section argues the case for
parallel decomposition in general and traces directions for future work in the field.


4.1  General  Information 

Method: Staircase subproblems were formed and solved on an “as available” basis
using p processors. Subproblems are considered available when they have just received
new information from an adjacent node on the network. When a subproblem finishes
optimal, both the primal and dual solutions are communicated. When infeasible,
only the dual solution is broadcast, and when unbounded, only the primal solution
is broadcast. No dual optimal solution can be sent until one has been received,
except for rightmost subproblems. As a result, the Phase I algorithm (for obtaining a
primal feasible solution) is a serial one. Computational results exhibit this property.
Also, the results show parallel decomposition outperforming the simplex method on
problems having more than 2000 nonzero entries.


Figure 4.1: Dimensions of a step (partitions n-1, n, and n+1 shown).

Memory Space: The size of the Fortran code and the amount of memory required
for data storage are parameterized in the following terms:

$N$ = # subproblems (max $N$ is $N = 51$),
$p$ = # processors (max $p$ is $p = 20$),
$r_a$ = # coupling rows for arc $a$ (max $r_a$ is $r_a = 300$),
$r_n$ = # nonzero rows in $n$'s partition of $A$, minus $r_{n,n+1}$,
$c_n$ = # columns in $n$'s partition of $A$,
$e_n$ = # nonzeros in $n$'s partition of $A$,
$\hat{r}_n$ = # rows in subproblem $n$,
$\hat{c}_n$ = # columns in subproblem $n$, and
$\hat{e}_n$ = # nonzeros in subproblem $n$.

The maximum values for $N$, $p$ and $r_a$ are given for the test configuration. Figure 4.1
represents the pattern of nonzero coefficients near the $n$th partition. The lengths of
the bold lines show the dimensions of $r_n$, $r_{n,n+1}$, and $c_n$ for this partition. Closed-form
equations for the subproblem dimensions $(\hat{r}_n, \hat{c}_n, \hat{e}_n)$ are

$$\hat{r}_n = r_n + r_{n,n-1} + r_{n,n+1},$$
$$\hat{c}_n = c_n + r_{n,n+1},$$
$$\hat{e}_n = e_n + r_{n,n+1} + 2 r_{n,n-1}.$$

Given the values of these parameters, an expression for the total amount of shared
memory used by the program can be calculated as

$$\#\,\text{bytes} = 496N + 16p + 16 N r_a + 8 \sum_n \left( \hat{e}_n (1.25 + 4.5\,\hat{r}_n / \hat{c}_n) + 17.5\,\hat{r}_n + 6.75\,\hat{c}_n \right).$$

Software: The computer code presented here is based on MINOS 5.1. It uses
MINOS in its entirety, with a few extra routines spliced in here and there. MINOS
consists of three basic modules: Input, Solve, and Output. The parallel decomposition
algorithm has two additional modules. The first, Form Subs, is inserted after the
MINOS Input module, and the second, Process Subs, governs parallel MINOS Solves
and decomposition message handling. In addition, a small amount of extra work is
involved in collating the many subproblem solutions into one overall solution before
they are Output. Thus, the Input/Output work is slightly greater for decomposition.

MPS and SPECS Input Files: These files are input using the MINOS Input
Module. The standard MPS file is input to determine the Problem Data. It is
assumed that this MPS file describes an LP that has a block diagonal or staircase
structure. Normal MINOS input files are sufficient to solve the LP as a single large
problem. However, to decompose a block diagonal structure into n subproblems,
additional information must be provided in the DSPECS file.

DSPECS Input File: This file contains the additional information needed to
complete a staircase decomposition linear program. An example of such a file is:

    0       Debugging Parameter
    50      % of extra rows to add to each subproblem
    100     % of extra columns to add to each subproblem
    3       number of subproblems
    20 30   number of rows and columns in the first subproblem (optional)
    20 30   number of rows and columns in the second subproblem (optional)
    20 30   number of rows and columns in the third subproblem (optional)
    4       number of processors (actually specified in JCL)

Since this is strictly Benders decomposition on a staircase system, the number of
subproblems $N$ equals the number of nodes, and $\mathcal{N} = \{1, \ldots, N\}$,
$\mathcal{A} = \{(1,2), (2,1), \ldots, (N-1, N), (N, N-1)\}$, $T_a = \text{left}$ if $a = (n, n-1)$
for some $n \in \mathcal{N}$, and $T_a = \text{right}$ if $a = (n-1, n)$ for some $n \in \mathcal{N}$.
This means that the entire communication network is defined in terms of $N$ and
$|\mathcal{C}_n|$, $\forall n \in \mathcal{N}$. The sets $\mathcal{R}_n$ and $\mathcal{C}_n$ are well defined given
the number of columns in each partition and the fact that the elements of $\mathcal{R}$ and $\mathcal{C}$
are ordered.
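
As a small illustration of how little is needed to define the staircase network, the sketch below builds the node set and the typed arcs from the number of subproblems alone. The function name `staircase_network` and the dictionary representation are assumptions made for this example; they are not taken from the thesis code.

```python
def staircase_network(num_subproblems):
    """Build the nodes and typed arcs of a linear (staircase) network.

    Arc (n - 1, n) is a right arc and arc (n, n - 1) is a left arc,
    following the definitions in the text.
    """
    nodes = list(range(1, num_subproblems + 1))
    arc_type = {}
    for n in nodes[1:]:
        arc_type[(n - 1, n)] = "right"
        arc_type[(n, n - 1)] = "left"
    return nodes, arc_type


nodes, arcs = staircase_network(3)
print(nodes)
print(arcs)
```

A network with N nodes built this way has 2(N - 1) arcs, one left arc and one right arc for each pair of adjacent subproblems.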

Output Files: Each processor has a standard Fortran output file, and therefore the
MINOS-type iteration log of each subproblem solved by each processor will appear
in the corresponding file. In addition, the root process appends decomposition and
parallel computation summary statistics, and the overall LP solution, to its standard
output file. The solution has the same format as MINOS. Finally, one short summary
file is written by the root process that also contains the summary statistics.

Forming  Subproblems 

The serial version of this module was first documented in [Ent87]. Chapter 3 described,
for the most general case, how to form subproblems. The staircase version implicitly
assumes a specific communication graph, as in Figure 4.2, and that no row or column
permutations are necessary to obtain a staircase pattern in $A$. Before the Form
Subs step, the subproblems are dimensioned based on these implicit assumptions. No
intermediate information, as outlined in Chapter 3, need be used.


Figure 4.2: Linear communication network for staircase pattern.


A simple heuristic is used to provide subproblem dimensions. The object of the
heuristic is to partition the ordered columns of the matrix into a user-specified number
of sets $n$. The columns in each set are adjacent, and they are chosen so as to minimize
the number of coupling rows between the columns of adjacent sets. First a profile
function $f(\cdot) : \mathcal{C} \to \mathcal{R}$ is calculated, where

$$f(j) = \max\{i \in \mathcal{R} : A_{ij} \neq 0\} - \min\{i \in \mathcal{R} : A_{ij} \neq 0\}.$$

Then $N - 1$ local minima are found so that the distances between them are nearly
the same. It is also desirable to get the local minima as small as possible. If the
problem does not have enough steps to supply $N$ subproblems, a warning is printed, and
the number of steps found is used.
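
The profile function can be illustrated on a small dense matrix. The sketch below assumes the profile of column j is the difference between the largest and smallest row index holding a nonzero in that column, and that every column has at least one nonzero. The function name `profile` and the example matrix are hypothetical; the rule for choosing the N - 1 local minima is not spelled out in the text and is not attempted here.

```python
def profile(A):
    """Column profile: f(j) = max{i : A[i][j] != 0} - min{i : A[i][j] != 0}.

    A is a dense matrix given as a list of rows; every column is assumed
    to contain at least one nonzero entry.
    """
    heights = []
    for j in range(len(A[0])):
        rows = [i for i, row in enumerate(A) if row[j] != 0]
        heights.append(max(rows) - min(rows))
    return heights


# A small two-step staircase pattern (illustrative data only).
A = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
print(profile(A))
```

Columns with a small profile value touch few rows, which is why local minima of the profile are natural candidates for boundaries between adjacent column sets.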

Processing  Subproblems 

When the solution of a neighboring subproblem arrives at the mailbox of a given
subproblem, an independent job is defined. (Independence between jobs means that
they can be executed concurrently.) Messages from different subproblems can be
handled by a single job as described in Table 4.1.


Receive Message(s)
Make  Modification(s) 
Perform  Oracle 
Broadcast  Solution 
Die 

Table  4.1:  The  life  of  a  job. 


Jobs are serviced, in the order they were made available, by any available
processors. Table 4.1 lists the four steps associated with a job, after which the job dies.
The first round of jobs may skip the first two steps if there are no messages to receive.

Since this is an implementation of Benders decomposition, if a primal solution
is received, the RHS is modified. If a dual solution is received, a new constraint is
added.

The oracle is performed by a subroutine call to the Solve Module of MINOS. A
pointer, passed as a parameter, directs MINOS to the proper data set, and in return,
some solution form is provided according to the oracle's definition.

Different  information  is  broadcast  under  the  differing  exit  conditions  of  the  Solve 
Module. 


When optimal, the primal extreme point is passed over an outgoing right arc (if
one exists), and the dual extreme point is passed over an outgoing left arc (if one
exists) only if there is already an extreme point in the present subproblem’s extra
constraints, or it is rightmost. This guarantees dual feasibility of the information
passed on left arcs.


When unbounded, in addition to the primal extreme ray, a primal extreme point
is broadcast over an outgoing right arc (if it exists). Since MINOS is an implementation
of the simplex method, the extreme point is available and used.


When infeasible, the dual extreme ray is passed left. In this situation, the parallel
decomposition algorithm is actually serial, because only one new job is created
from that finishing. Some test problems spend much of the time with infeasible
subproblems. An extreme example is SC205, which has only a single nonzero objective
coefficient in the leftmost subproblem. This makes all but the leftmost subproblem
feasibility problems: we need only find a feasible point because the objective is
vacuous. The decomposition algorithm can be made parallel by passing an infeasible
primal solution to the right, but this information must not be relied on as part of an
overall solution. At the time of this writing, we have yet to implement this feature.
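
The three exit conditions above amount to a small decision rule for what each finished job broadcasts. The following sketch restates that rule; the function `messages_to_send`, its parameter names, and the string labels are hypothetical and chosen only for this illustration. The parameter `has_dual_point_cut` stands for "there is already an extreme point in the present subproblem's extra constraints".

```python
def messages_to_send(status, has_right_arc, has_left_arc,
                     has_dual_point_cut, is_rightmost):
    """Return (arc, content) pairs to broadcast after a solve.

    status is one of "optimal", "unbounded", or "infeasible".
    """
    out = []
    if status == "optimal":
        if has_right_arc:
            out.append(("right", "primal extreme point"))
        # The dual point goes left only when it is known to be dual feasible.
        if has_left_arc and (has_dual_point_cut or is_rightmost):
            out.append(("left", "dual extreme point"))
    elif status == "unbounded":
        if has_right_arc:
            out.append(("right", "primal extreme point and primal extreme ray"))
    elif status == "infeasible":
        if has_left_arc:
            out.append(("left", "dual extreme ray"))
    else:
        raise ValueError(f"unknown status: {status}")
    return out


# An interior subproblem that finishes optimal before receiving any dual point:
print(messages_to_send("optimal", True, True, False, False))
```

In the infeasible case at most one message is produced, which is the reason the text describes the algorithm as serial while subproblems remain infeasible.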
Convergence and the Termination Criterion

Dual solutions are extreme points of the dual feasible region of the neighbor and
are therefore finite in number. If a dual solution corresponds to a non-binding constraint,
the job is not executed. Eventually, no new jobs will be created; at this point, an
optimal solution has been found.

To test whether a constraint will be binding, the objective value $z_{n_2}$ of the subproblem
$n_2$ that sent the dual extreme point is compared with the value of the variable $t_a$ in
subproblem $n_1$, where $a = (n_2, n_1)$ is the arc that carried the message. Since $t_a$ is a
lower bound on the value of $z_{n_2}$, if

$$z_{n_2} - t_a < tol,$$

then the constraint will be non-binding. The value of $tol$ is the default feasibility
tolerance used by MINOS.
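
The test above is a single comparison, sketched here for concreteness. The function name `cut_is_nonbinding` and its argument names are hypothetical, and the tolerance value used in the example is only a placeholder: the text says the MINOS default feasibility tolerance is used but does not state its value.

```python
def cut_is_nonbinding(sender_objective, t_a, tol):
    """Return True when the constraint built from the sender's dual extreme
    point would be non-binding, i.e. when the sender's objective value
    exceeds the current value of t_a by less than tol."""
    return sender_objective - t_a < tol


TOL = 1.0e-6  # placeholder value, assumed for illustration only

print(cut_is_nonbinding(10.0, 10.0, TOL))  # no gap: the job is skipped
print(cut_is_nonbinding(12.0, 10.0, TOL))  # a gap remains: the job is run
```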

Discarding  Constraints 

Typically, a large number of constraints will be added to a given subproblem.
However, not all of them are necessary to obtain an optimal solution. At most $|\mathcal{R}_a|$ can
be binding at the final solution. We actually keep $|\mathcal{R}_a| + 2$ for good measure. The
decomposition code overwrites the added constraints that are no longer binding. It
replaces the constraint that has been slack for the greatest number of solves.
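
One way to picture this replacement policy is a fixed-size pool with a slack counter per stored constraint. The sketch below is an assumption-laden illustration, not the thesis code: the class `CutPool` is hypothetical, and it assumes the slack counter is reset whenever a constraint becomes binding again, a detail the text does not specify.

```python
class CutPool:
    """Fixed-size store of added constraints for one incoming arc."""

    def __init__(self, num_coupling_rows):
        self.capacity = num_coupling_rows + 2   # |R_a| + 2, as in the text
        self.cuts = []          # stored constraints (any representation)
        self.slack_count = []   # solves for which each cut has been slack

    def record_solve(self, binding_flags):
        """Update counters after a solve; one boolean per stored cut."""
        for k, binding in enumerate(binding_flags):
            # Assumption: a binding cut has its counter reset to zero.
            self.slack_count[k] = 0 if binding else self.slack_count[k] + 1

    def add(self, cut):
        """Append while there is room; otherwise overwrite the cut that
        has been slack for the greatest number of solves."""
        if len(self.cuts) < self.capacity:
            self.cuts.append(cut)
            self.slack_count.append(0)
            return
        k = max(range(len(self.cuts)), key=self.slack_count.__getitem__)
        self.cuts[k] = cut
        self.slack_count[k] = 0


pool = CutPool(num_coupling_rows=1)      # capacity is 3
for name in ("c0", "c1", "c2"):
    pool.add(name)
pool.record_solve([True, False, False])
pool.record_solve([True, False, True])
pool.add("c3")                           # overwrites c1, slack for two solves
print(pool.cuts)
```

Because at most $|\mathcal{R}_a|$ constraints can be binding, a full pool of $|\mathcal{R}_a| + 2$ always contains a slack constraint to overwrite at the final solution.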

Communication 

Messages contain a quantity of information that is a function of the number of
coupling constraints $r_a$ between the communicating subproblems. Table 4.2 gives the
lengths of each message type in bytes. The maximum message length is 16*(3+86) =
1424 bytes for all the test problems.

    Message                  Bytes
    Primal Point             8 * (3 + r_a)
    Primal Point and Ray     16 * (3 + r_a)
    Dual Point or Ray        8 * (4 + r_a)

Table 4.2: Message sizes.


Sending a message involves loading it into a buffer and copying the buffer into the
proper mailbox. Receiving a message involves copying it from the proper mailbox
into a buffer. Subproblems have one mailbox for each incoming arc. Each mailbox
is capable of holding only one message. If a new message arrives before the old
one is read, the old one is discarded. Discarding messages in this fashion does not
affect finite convergence (but according to [HSLSS], it is possible for such retained
information to be used to speed convergence).
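
The message sizes of Table 4.2 and the one-slot mailbox rule are easy to state in code. The sketch below is illustrative only; the function `message_bytes`, the class `Mailbox`, and the string labels are hypothetical names, not part of the thesis code or of the Intel iPSC library.

```python
def message_bytes(kind, r_a):
    """Message length in bytes for r_a coupling rows, following Table 4.2."""
    if kind == "primal point":
        return 8 * (3 + r_a)
    if kind == "primal point and ray":
        return 16 * (3 + r_a)
    if kind == "dual point or ray":
        return 8 * (4 + r_a)
    raise ValueError(f"unknown message kind: {kind}")


class Mailbox:
    """Holds at most one message; a newer message replaces an unread one."""

    def __init__(self):
        self._slot = None

    def deliver(self, message):
        self._slot = message   # any unread message is discarded

    def read(self):
        message, self._slot = self._slot, None
        return message


print(message_bytes("primal point and ray", 86))

box = Mailbox()
box.deliver("old solution")
box.deliver("new solution")
print(box.read())
```

With 86 coupling rows the largest message type gives 16 * (3 + 86) = 1424 bytes, the maximum quoted in the text for the test problems.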


Basis  Factorization 

MINOS maintains a basis factorization that is updated by the decomposition code as
appropriate after each modification to a subproblem. The routines for this purpose are
in the software package called LUSOL and are documented in [GMSW86]. As a result
of making both row and column updates, the factorization needs to be recalculated
only when it becomes inaccurate or too large. The default settings from MINOS are
used to govern refactorization.


4.2  Testing 

The following experiments were performed to test the performance of parallel
decomposition algorithms. A test suite of twenty-two staircase linear programs was solved
with different partitions and different numbers of processors. The conclusions are
that the algorithm is consistently well behaved in its use of additional processors and
that it outperforms the serial algorithm (the simplex method) in most cases, when
using only four processors.

4.2.1 The Test Environment

We report the test environment so that the interested reader may reproduce the same
conditions on a variety of parallel machines.


    Language               Fortran 77 with IBM Parallel Fortran Extensions
    Compiler               IBM Parallel VS Fortran with VS Fortran V2 Rel 1.1
    Compiler Options       No Vector, No Parallel, Optimize Level 3,
                           Dynamic Shared Common
    Computer               IBM 3090/600E,
                           2 Gbytes shared virtual memory, and
                           128 Mbytes real extended memory
    Operating System       MVS/XA V2.2.0
    Code + Local Common    0.62 Mbytes
    Shared System Common   0.27 Mbytes
    Shared Data Common     1.60 Mbytes
    Total Shared Common    1.87 Mbytes
    Total Memory           2.49 Mbytes
    Tolerances             MINOS Defaults
    Message Passing        w/o locks
    Job Flow Control       with locks

The processors were aligned after dispatch with a barrier.


4.2.2  The  Test  Suite 

All but three of the twenty-two staircase linear program test problems were chosen
from a collection of fifty-three used by Lustig [Lus87] in a performance evaluation of
the simplex method. Included in his report are pictures of the patterns of nonzeros
for the test suite; see Appendix B.

Table 4.3 lists the LP dimensions for the test suite. The problems are ordered
by the number of nonzeros. All but three are part of a set of test problems made
available by Gay [Gay85] and distributed over netlib [DG87]. The DIET series of test
problems was created from an example in [Chv83] and documented in [Ent88]. It
was originally intended for debugging purposes. The optimal objective values for the
problems as reported by Gay (excluding the DIET series) are included in the table.

[Table 4.3: Test problem dimensions.]


[Table 4.4: Test problem step dimensions.]

Table 4.4 contains the staircase dimensions for each of the test problems. The
default number of subproblems created is listed along with the number of steps in
the staircase. The minimum and maximum dimensions of each step are given, along
with the coupling between adjacent steps as described earlier in Figure 4.1.

4.2.3  Test  Designs  and  Results 

The physical properties power, work and time are excellent terms to describe the
performance of a parallel algorithm. In the computing environment, the unit of
power is a CPU, the unit of work is a CPU second, and the unit of time is a second
as measured with a wall clock. One can view work, or CPU time, as the rent paid
for use of the computer. The absolute performance measure, however, is usually the
elapsed time needed to obtain a solution.


Figure 4.3: Used CPU power for each test problem. (Bar chart over test problems
1 to 22; series: MINOS, DECOMP/1, DECOMP/2, DECOMP/3, DECOMP/4.)


Power: A good parallel algorithm has two properties. First, it makes efficient use
of the CPU power. When two CPUs are made available, both are actually used. Most
algorithms do not achieve perfect efficiency. Eighty percent is often considered very
good. Figure 4.3 displays the average CPU power applied to solve each test problem
when decomposed into the default number of subproblems given in Table 4.4. It shows
that the algorithm has little trouble utilizing more CPU power, especially on the larger
problems. Of course there is a limit. Remember that at most $N - 1$ processors can
be kept busy by the algorithm, where $N$ is the number of nodes/subproblems. These
experiments were run with at most four CPUs because although the 3090/600E has
six, it cannot effectively offer more than four CPUs in a multi-user environment.

There are no decomposition results for problems SCAGR25, STAIR and PILOT4
because dual degeneracy prevented progress and a primal feasible solution was not
obtained.




Notice also that the used CPU power for problem SC205 is at or near one regardless
of p, because in this case the computer spends most of its time obtaining a primal
feasible solution. At the time of writing, the Phase I algorithm is serial, and only one
CPU is used despite the availability of more. In fact SC205 has a vacuous objective
row for all but the first step in the staircase. As soon as a feasible point is found, it is
the optimal one. The Phase I algorithm could be made parallel by passing infeasible
primal solutions to the right, but this has not yet been done.


Figure 4.4: Strings of work. (Diagram labels include Form Subs and Print Solution.)

Useful Work and Idle Time: Figure 4.4 will help us understand how the data
for Figure 4.3 and all subsequent figures were collected. The solid lines in the figure
represent useful serial work doing input and output. These times are ignored. The
only times reported are those for the parallel phase, which represents a majority of
the work done, especially for large problems.

After data is read from disk, the work fans out to p independent strings of work
with one barrier between the Form Subs and Process Subs steps. The parallel lines in
Figure 4.4 are shaded grey with intermittent white sections. This is to represent useful
and idle work time. Useful work is spent forming and solving subproblems, whereas
idle work is spent counting. In the multi-user environment on the IBM 3090/600E,
it is important to “waste time” counting because we must know how much idle time
is really being used. It must be measured somehow. A production code would not
do this. Idle time would be filled with useful work from other users’ jobs. Counting
idle time degrades performance, but this is the price of simulating a “generic” computing
environment.

The  CPU  power  in  Figure  4.3  is  the  ratio  of  useful  work  (total  length  of  all  the 
grey  lines)  to  the  total  work  (total  length  of  the  grey  and  white  lines).  It  is  a  measure 
of  the  effective  CPU  power  applied  to  solving  the  problem. 

A second aspect of collecting CPU times needs to be reported. Each parallel
string of work is implemented as a series of MVS operating system tasks, the number
of which is not predetermined. Partly because of this, the IBM Parallel Fortran
Compiler has no facility for collecting individual CPU times. An assembler language
routine for collecting MVS task times was used instead.

On  every  call  to  the  Parallel  Fortran  Library  the  MVS  task  may  change.  This 
has  been  likened  to  taking  a  sequence  of  taxis  to  travel  to  some  destination.  Street 
intersections  represent  library  calls.  You  never  know  when  you  will  change  taxis,  so 
to  ensure  payment,  you  make  installments  for  each  block  driven.  The  time  spent 
crossing  intersections  is  not  recorded.  Likewise,  the  MVS  task  time  is  recorded  be¬ 
tween  subroutine  calls,  but  the  time  spent  in  the  subroutine  library  is  not  recorded, 
and  causes  a  10%  to  15%  shortfall  in  the  total  CPU  lime  reported  for  the  largest 
test  problem  SCSDS;  see  Table  4.5.  The  unaccounted  time  falls  into  the  idle- work 
category  because  the  library  routines  are  called  only  when  a  processor  is  trying  to  find 
something  to  do  besides  count.  For  this  reason,  the  clocks  on  each  Fortran  Processor 
are  used  to  measure  only  useful  work,  while  a  job  clock  measures  the  total  length  of 
all  the  parallel  strings.  The  difference  between  the  sum  of  the  processor  clocks  and 
the  job  clock  is  attributed  to  idle  work. 
Problem Name   Real CPU      My CPU        Percent Error
SCSD8/2        15.32         [illegible]   [illegible]
SCSD8/3        12.74         11.569        9%
SCSD8/4        12.78         11.587        9%
SCSD8/5        [illegible]   9.693         [illegible]
SCSD8/6        12.13         10.908        10%
SCSD8/7        10.72         9.471         12%
SCSD8/8        [illegible]   12.312        9%
SCSD8/9        9.73          8.488         13%
SCSD8/10       10.97         9.708         12%
SCSD8/11       [illegible]   10.288        11%
SCSD8/12       11.42         10.072        12%
SCSD8/13       10.89         9.575         12%
SCSD8/14       [illegible]   9.377         12%
SCSD8/39       31.55         28.967        8%

Table 4.5: Shortfalls in measuring work.
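As a consistency check on Table 4.5, the sketch below recomputes the percent error for the rows whose entries are legible. It assumes, as an inference from the numbers rather than a statement in the text, that the error is the shortfall expressed as a percentage of the Real CPU time; the names `TABLE_4_5` and `percent_shortfall` are hypothetical.

```python
# (Real CPU, My CPU, printed Percent Error) for the legible rows of Table 4.5.
TABLE_4_5 = {
    "SCSD8/3": (12.74, 11.569, 9),
    "SCSD8/4": (12.78, 11.587, 9),
    "SCSD8/6": (12.13, 10.908, 10),
    "SCSD8/7": (10.72, 9.471, 12),
    "SCSD8/9": (9.73, 8.488, 13),
    "SCSD8/10": (10.97, 9.708, 12),
    "SCSD8/12": (11.42, 10.072, 12),
    "SCSD8/13": (10.89, 9.575, 12),
    "SCSD8/39": (31.55, 28.967, 8),
}


def percent_shortfall(real_cpu, my_cpu):
    """Shortfall of the measured time as a percentage of the real CPU time."""
    return 100.0 * (real_cpu - my_cpu) / real_cpu


# Rows whose recomputed, rounded percentage differs from the printed one.
mismatches = [
    name
    for name, (real, mine, printed) in TABLE_4_5.items()
    if round(percent_shortfall(real, mine)) != printed
]
print(mismatches)
```

Under that assumption every legible row reproduces its printed percentage, which supports reading the last column as a shortfall relative to the Real CPU column.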

Another set of runs was executed with and without the processor clocks, providing
a very sensible illustration of the Heisenberg Principle. You cannot measure
performance without affecting it.

Job CPU times are reported in Table 4.5. SCSD8 was solved multiple times for
N = 11 and p = 1, 4, both with and without the processor clocks. Times were two
percent faster without the clocks when using only one processor, and four
percent slower without the clocks when using four processors.
We  can  expect  a  similar  effect  for  other  test  problems. 

Two speculations have been offered to explain the variations caused by calling
the clocks. The first is that part of the excess time is lost by the operating
system during the system clock calls [Wel89]. The other is that the operating system
uses the clock calls as opportunities to interrupt the processor [For89]. A system
interrupt  would  appear  beneficial  in  that  it  would  most  likely  be  interrupting  idle 
work  time. 

Finally, we can see from Figure 4.5 that the total work required to solve a problem
is not deterministic. The same program configuration was run several times with
different results. Hence, the reported times are averages of several successive
runs.

Figure 4.5: The Heisenberg Principle.

Work:  The  second  property  of  a  good  parallel  algorithm  is  that  the  total  work  does 
not  increase  as  the  number  of  processors  increases.  Figure  4.6  gives  the  total  parallel 
work  done  on  each  test  problem  using  both  MINOS  and  DECOMP  (p  =  1,2, 3, 4). 
Notice  that  the  work  actually  decreases  from  p  =  1  to  p  =  2  for  problem  13.  This  is 
possible  because  there  is  no  control  over  the  path  taken  to  the  solution,  and  different 
paths  can  be  taken  for  different  numbers  of  processors.  The  conclusion  to  be  drawn 
from  these  results  is  that  the  total  work  does  not  substantially  increase  as  the  number 
of  processors  increases. 

Time: Together with the effective use of CPU power, we obtain respectable reductions
in elapsed times to solve the test problems, as shown in Figure 4.7. We have
used a log scale for this figure because of the great disparity in time required to solve
the small and the large problems. A more effective presentation is made by normalizing
the scale for each test problem. In Figure 4.8, the times of each individual test
problem have been normalized by the time used by MINOS. The result is called the
“Speedup over MINOS.” In this case, a value of 2 would mean that the decomposition
algorithm found the solution twice as fast as MINOS. The figure shows that parallel
decomposition is consistently better than the simplex method on the larger problems.

Figure 4.6: Work required to solve each test problem.

Figure 4.7: Time required to solve each test problem.


Figure 4.8: Speedup over MINOS for each test problem (legend: MINOS, DECOMP/1, DECOMP/2, DECOMP/3, DECOMP/4).


Speedups: There are two benefits derived from parallel decomposition that give
such  speedups.  The  first  is  that  for  the  larger  problems,  decomposition  alone  (p  =  1) 
has  offered  a  speedup.  For  instance,  problem  21  (SCTAP3)  is  solved  10.5  times  faster 
just  because  of  a  change  of  algorithm. 

The second benefit, naturally, is derived from using more CPU power. Figure 4.9
is a display of elapsed times that were normalized by the time used by DECOMP/1
for each test problem. With this perspective, we can effectively judge the benefits
of adding processors. Notice that for problem 21, the computation was sped up
by an additional factor of 1.6 over DECOMP/1 because of the addition of three
CPUs of power, for a total of four CPUs. The overall benefit provided by parallel
decomposition with four processors was a speedup factor of 16.8, as seen in Figure 4.8.

Figure 4.9: Speedup over DECOMP/1 for each test problem.
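The two benefits multiply: the speedup from changing the algorithm (10.5) times the speedup from adding processors (1.6) gives the overall factor of 16.8. The sketch below makes that arithmetic explicit. The elapsed times are hypothetical values chosen only to reproduce the ratios quoted for problem 21; they are not measurements from the thesis.

```python
def speedup(reference_time, time):
    """Speedup of a run relative to a reference run (both in elapsed seconds)."""
    return reference_time / time


# Hypothetical elapsed times, chosen only so the ratios match the text.
t_minos, t_decomp1, t_decomp4 = 168.0, 16.0, 10.0

from_algorithm = speedup(t_minos, t_decomp1)     # decomposition alone, p = 1
from_processors = speedup(t_decomp1, t_decomp4)  # adding three more CPUs
overall = speedup(t_minos, t_decomp4)            # both effects together

print(from_algorithm, from_processors, overall)
```

The identity `overall == from_algorithm * from_processors` holds for any choice of times, because the DECOMP/1 time cancels in the product.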

The Number of Subproblems: This experiment demonstrates the increase in
overhead of decomposition as the number of subproblems increases. The largest test
problem, SCSD8, has 39 steps. If subproblems are limited to a whole number of
steps, the number of subproblems is limited to the set {2, 3, ..., 39}. Note also that
there are 38 ways to partition the two-subproblem case. The number of steps per
subproblem was chosen to be nearly the same in each case.
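One simple way to obtain nearly equal subproblems is sketched below. It is an assumption about how such a partition could be built, not the rule used in the experiments: the helper names are hypothetical, and placing the larger groups first is an arbitrary choice.

```python
def partition_steps(num_steps, num_subproblems):
    """Sizes of contiguous groups of steps that differ by at most one step."""
    if not 1 <= num_subproblems <= num_steps:
        raise ValueError("need between 1 and num_steps subproblems")
    base, extra = divmod(num_steps, num_subproblems)
    # The first `extra` subproblems receive one additional step.
    return [base + 1 if i < extra else base for i in range(num_subproblems)]


def count_two_way_cuts(num_steps):
    """Number of ways to cut a staircase into two contiguous, non-empty parts."""
    return num_steps - 1


print(partition_steps(39, 4))    # [10, 10, 10, 9]
print(count_two_way_cuts(39))    # 38
```

For 39 steps the average subproblem size is exactly three steps at N = 13, so any finer partition must contain subproblems of fewer than three steps.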

Even though communication times are negligible, because this is a shared-memory
computer, there is still a significant overhead involved in making the proper response
to all received messages. On the other hand, there is an uncertain benefit from solving
a staircase with decomposition. These two effects combine in this experiment. Figure 4.10
shows the total work used to solve SCSD8 when the number of subproblems,
N, varies between 2 and 38 (even numbers only), and the number of processors, p,
varies from one to four.
Figure 4.10: Work to solve SCSD8 using increasingly finer partitions (legend: DECOMP/1, DECOMP/2, DECOMP/3, DECOMP/4).

For small N, the total work remains relatively constant, at around ten CPU seconds.
The run requiring the least amount of work is for N = 14 and p = 1 with 9.4 CPU
seconds. After the average size of a subproblem begins to fall below three steps (N ≈ 13),
the total work increases. This is consistent with our general observations with one
CPU, that a staircase problem with fewer than 2000 nonzeros is not worth decomposing,
as the decomposition overhead begins to outweigh its benefits. However, with more
processors the results are different.

The run solving the fastest overall, as seen in Figure 4.11, is N = 10, p = 4 with
3.4 elapsed seconds. For p = 2, the minimum is 5.8 elapsed seconds with N = 10 and
12, and for p = 3 the minimum is 3.7 elapsed seconds, occurring at N = 12.

The best speedup for a fixed number of subproblems is at N = 32, with a factor
of 4.7; see Figure 4.12. However, the best serial time versus the best parallel time is
8.7/3.4 = 2.6, but this was obtained only after an exhausting search.

The trend is consistent that more processors make decomposition faster. The
effect of the number of processors on solution time is likely to be a function of the
solver more than anything else. Small LP subproblems are best solved with a vectorized
tableau method. Specialized subproblems like network flows are best solved by
combinatoric algorithms. MINOS, the solver used here, performs best on medium-sized
staircase problems (relative to decomposition). Serial solvers should be chosen
according to the size and nature of the subproblems.

Figure 4.11: Time versus the number of subproblems for SCSD8.

Figure 4.12: Speedup versus the number of subproblems for SCSD8.

The  Number  of  Processors:  This  is  a  study  on  the  effective  use  of  processors. 
At a time when the computer was lightly loaded, SCSD8 was solved using from one to
seven Fortran Processors. In IBM Parallel Fortran, a Fortran Processor is a series of
MVS  Operating  System  tasks,  so  more  than  six  may  be  requested  for  a  six-processor 
machine.  Seven  is  the  limit  based  on  memory  restrictions. 


Figure 4.13: Work and time versus processors for SCSD8.

Figure 4.13 is a classic speedup diagram for this problem. Here, speedup is calculated
relative to the solution time for decomposition with one processor. Naturally,
the  point  (1,1)  is  represented.  The  diagonal  line  shows  the  ideal. 
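A speedup diagram of this kind can be computed from elapsed times as sketched below. The elapsed times are hypothetical, the function names are illustrative, and "efficiency" (speedup divided by the number of processors) is the standard notion used to measure distance from the ideal diagonal; it is not a quantity defined in the text.

```python
def speedup_curve(elapsed_by_processors):
    """Speedup relative to the one-processor run; the point (1, 1) is always present."""
    base = elapsed_by_processors[1]
    return {p: base / t for p, t in sorted(elapsed_by_processors.items())}


def efficiency(curve):
    """Fraction of the ideal (diagonal) speedup achieved for each processor count."""
    return {p: s / p for p, s in curve.items()}


# Hypothetical elapsed seconds for p = 1, 2, 4 (not measured data).
elapsed = {1: 8.0, 2: 5.0, 4: 3.2}
curve = speedup_curve(elapsed)
eff = efficiency(curve)
print(curve)
print(eff)
```

An efficiency of 1.0 corresponds to a point on the diagonal; values below 1.0 show how far a run falls under the ideal.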

The next figure, 4.14, graphs the dichotomy of Work versus Time for varying
numbers of processors. Sharp dips in the amount of work, as for the six-processor
case, can only be attributed to good fortune. Experiments on a dedicated machine
could  settle  many  uncertainties  as  to  the  true  benefactors  of  parallel  decomposition. 
At  this  writing,  we  can  say  only  that  they  exist. 

4.2.4  Performance  Extrapolations 

How  will  parallel  decomposition  perform  on  larger  problems? 

In the next two experiments, we reexamine the results for a constant number of
subproblems by grouping the test problems by family. We consider, as the problems
in  a  family  get  larger,  how  parallel  decomposition  should  perform  on  even  larger 
problems. 

Extending the Staircase: This is the first of two discussions regarding extrapolation
of the results beyond the test suite. One way to make a staircase problem
larger is to add more steps. This means that either the planning horizon is lengthened
or it is represented in finer detail. There are three such series in our test suite:
DIET, GROW, and SCSD. The speedup results from Figure 4.9 are reproduced in
Figure 4.15 for the latter two only, since the LPs of the DIET series are too small. We
see that as the length of the staircase extends, the parallel algorithm’s performance
is not degraded.

Figure 4.15: Speedup over DECOMP/1 for extending staircases.

Model Complexity: Another method of increasing the size of staircase problems
is to add more complexity to the model, i.e., to disaggregate. For instance, “dairy
products” becomes milk, cheese, yogurt and ice cream. Adding complexity allows a
model to give a more detailed solution, and the modeler to address interactions more
specifically. A summer rise in the price of the aggregate “dairy products” may only
be  a  reflection  of  more  demand  for  ice  cream! 

The SCTAP series of problems keeps the same number of steps, but increases the
number  of  rows,  columns  and  nonzeros  per  step.  Figure  4.16  is  a  reproduction  of 
the  elapsed  times  for  this  series.  It  shows  that  the  simplex  method  has  increasing 
difficulty  with  this  problem,  while  the  performance  of  the  parallel  decomposition 
algorithm  does  not  degrade. 
Figure 4.16: Speedup over DECOMP/1 for more complex staircases (test problems SCTAP1, SCTAP2, SCTAP3).


4.3  Conclusions

We have taken a long tour through the space of all communication networks, but
the experience has created surgeons from interns. What we slice apart is more than
just a linear program. It is a modeler’s presentation of some small part of the world.
The pieces and their interactions can now be observed from a new perspective: as a
network of communicating entities. The communication is structured and directed
toward obtaining a consensus via local agreements. How can communication patterns
be studied? Are there optimal configurations based on a modeler’s knowledge of the
natural configuration? What are the strong and the weak links? These are probing
questions to answer with further investigation.

The main conclusion to make about the computational results is that if serial
decomposition does well on a given problem then parallel decomposition does also. This
is not surprising, but what we have also seen is that even when serial decomposition
is slow, parallel decomposition can still be made to solve problems faster than the
simplex method by adding more processors. In general, adding more processors will
help, but there is a limit.

An  important  accomplishment  is  that  by  characterizing  the  oracle  and  the  relaxed 
oracle,  we  define  an  interface  that  allows  any  convenient  subproblem  solver  to  be  used. 
The  essential  part  of  decomposition  is  not  how  a  subproblem  is  solved,  but  the  form 
of  its  solution. 

In  addition,  we  can  now  see  that  the  subproblems  need  not  be  linear  programs. 
Convex  functions  and  regions  can  be  approximated  with  piece-wise  linear  functions 
and  extreme-point  representations. 
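One common way to build such a piecewise-linear approximation of a convex function is to take the maximum of tangent lines, which is the same idea as accumulating cuts. The sketch below is only an illustration under that assumption: the function names, the sample function f(x) = x^2, and the tangent points are all hypothetical and are not taken from the thesis.

```python
def tangent_cut(f, df, x0):
    """Return (slope, intercept) of the tangent line to f at x0."""
    slope = df(x0)
    return slope, f(x0) - slope * x0


def piecewise_linear(cuts, x):
    """Evaluate the maximum of the tangent lines: a convex under-estimate of f."""
    return max(a * x + b for a, b in cuts)


def f(x):
    return x * x


def df(x):
    return 2.0 * x


# Tangent points chosen arbitrarily for the illustration.
cuts = [tangent_cut(f, df, x0) for x0 in (-2.0, -1.0, 0.0, 1.0, 2.0)]

print(piecewise_linear(cuts, 1.0))   # exact at a tangent point
print(piecewise_linear(cuts, 0.5))   # under-estimates f(0.5) = 0.25
```

For a convex function every tangent line lies on or below the graph, so the approximation never over-estimates, and adding more tangent points tightens it.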

Finally,  no  practical  implementation  of  a  theoretical  algorithm  is  perfect.  Ours 
needs  work  to  make  it  more  robust  and  handle  ever  larger  problems.  Let  it  be  our 
hope  that  the  techniques  and  ideas  discussed  here  will  find  practical  use. 


Appendix  A 

Example  Subproblem 
Formulations 


We now present examples of decomposition applied to three structured linear programs,
and one that is unstructured. These are intended to offer a better understanding
of the previous sections, and serve as recommended procedures for applying the
concepts of this thesis to practical examples.

Block Diagonal: This is the simplest example for decomposition. The problem
consists of two completely independent linear programs contained in one. By
investigating the formulations of the subproblems, we find that decomposition
can impose dependencies not regularly recognized in practice.

Staircase: Here we take a staircase pattern and slice it vertically just as in the diagram
in the introduction of this thesis. In the final chapter, we apply the parallel
oracle to the resulting subproblems for a variety of real-world test problems.

Two-Stage Stochastic: This is our first example of cross nesting. Again, the diagram
in the introduction contains the anatomy and the sequence of slices used.

Dense: This nondescript structure is used to demonstrate a procedure by which the
anatomic structure is broken down to the level of a single coefficient.

These examples do not make use of the subproblem interface theorem, so for instance
a vertical arc index set is defined on the intersection of the row index sets of the
joined nodes.


A.1  Block  Diagonal  Example

We begin our series of examples with the simplest block diagonal case, where the
constraints of two subproblems lie in independent spaces. The subproblems are completely
independent, except that they are coupled via the objective, indexed by $\sigma$. In
this example there will be information passed between the subproblems, but only of
the most trivial nature.

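\[
\begin{array}{lll}
\min\limits_{x^i \ge 0} & c^1 x^1 + c^2 x^2 = z & \\
\text{s.t.} & \pi^1:\; A^{11} x^1 \ge b^1 & \qquad \text{(A.1)} \\
 & \pi^2:\; A^{22} x^2 \ge b^2. &
\end{array}
\]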

As  noted  earlier,  the  names  of  the  primal  and  dual  variables  of  the  Block  Diagonal 
Problem  are  used  as  indices  for  the  rows  and  columns  of  the  coefficient  matrix  A. 
The block diagonal LP (A.1) has superscripts in order to differentiate the problem
data  and  variables  from  those  of  the  subproblems,  which  will  have  subscripts. 

Block  Diagonal  Problem  Description: 

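\[
\mathcal{R} = \pi^1 \cup \pi^2, \qquad \mathcal{C} = x^1 \cup x^2,
\]
\[
A = \begin{pmatrix} A^{11} & 0 \\ 0 & A^{22} \end{pmatrix} \in \Re^{\mathcal{R} \times \mathcal{C}}, \qquad
b = \begin{pmatrix} b^1 \\ b^2 \end{pmatrix} \in \Re^{\mathcal{R}}, \qquad
c^T = \begin{pmatrix} c^1 & c^2 \end{pmatrix} \in \Re^{1 \times \mathcal{C}}.
\]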

Block  Diagonal  Communication  Network  Description: 

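\[
(\mathcal{N}, \mathcal{A}) = (\{1, 2\}, \{(12), (21)\}),
\]
\[
\mathcal{R}_1 = \pi^1, \qquad \mathcal{R}_2 = \pi^2,
\]
\[
\mathcal{C}_1 = x^1 \cup x^2, \qquad \mathcal{C}_2 = x^1 \cup x^2,
\]
\[
T_{12} = \text{down}, \qquad T_{21} = \text{up}.
\]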
Block Diagonal Incidence Graph Description:

\[
h = (\{\sigma, \pi^1, \pi^2, s, x^1, x^2\},\; \{(\sigma x^1), (\sigma x^2), (\pi^1 s), (\pi^1 x^1), (\pi^2 s), (\pi^2 x^2)\}).
\]

Recall that $\sigma$ and $s$ index the objective and right-hand side, respectively.


Block Diagonal Arc Index Sets: There are no horizontal arcs, so $\mathcal{C}_{12} = \mathcal{C}_{21} = x^2$.


Block Diagonal Partition Graphs: There is only one partition graph, so all
added variables will be indexed by their associated arcs: $\mathcal{R}_p = \mathcal{R}$ and
$\mathcal{C}_p = \mathcal{C}$.

Block Diagonal Subproblem ($1 \in \mathcal{N}$): Node one is topmost and leftmost.
Original  Variables:  xx  €  and  xi  G  SR0’. 

Original Data: $A_1 = (A^{11} \;\; 0)$, $b_1 = (b^1)$, and $c_1^T = (c^1 \;\; c^2)$.


Incoming Arcs: There is one incoming arc to node one, (21), and it has type
$T_{21} = \text{up}$. It determines the added variables and data.

Added  Variables:  In  G  02:  G  &,  and  foi  G 

Added Data: 1 is indexed by $(\theta_{21}, s_1)$, $g_{12}$ is indexed by $(\theta_{21}, l_{21})$, $X_{12}$ is indexed
by $(\psi_{21}, l_{21})$, and $-I_{12}$ is indexed by $(\psi_{21}, x_1)$.


Non-negativity: $l_{21} \ge 0$ and $x_1$ is free.


Constraint Types: $\pi_1$ is a $\ge$, $\psi_{21}$ is an $=$, and $\theta_{21}$ is an $=$.
Formulation ($1 \in \mathcal{N}$):

\[
\begin{array}{c|cc|cl}
 & l_{21} & x_1 & & s_1 \\ \hline
\pi_1 & & A_1 & \ge & b_1 \\
\psi_{21} & X_{12} & -I_{12} & = & 0 \\
\theta_{21} & g_{12} & & = & 1 \\ \hline
 & \ge 0 & \text{free} & & \\ \hline
\sigma_1 & 0 & c_1^T & &
\end{array}
\]

Notice in the formulation for node 1 that for the dual variables, $\psi_{21} = c^1$ because of a
dual identity with the objective row. This means that the information being passed
to node 2 is constant and equals the values of the original objective for the columns
$x^1$.

Block Diagonal Subproblem ($2 \in \mathcal{N}$): Node two is leftmost and not topmost.
Original  Variables:  jt2  6  SR*3  and  x2  €  S^3. 

Original Data: $A_2 = (0 \;\; A^{22})$, $b_2 = (b^2)$, $c_2^T = (0 \;\; 0)$.

Incoming Arcs: There is one incoming arc (12), of type $T_{12} = \text{down}$.

Added Variables: $w_{12} \in \Re$ and $u_{12} \in \Re$.

Added Data: $\psi_{21}$ is indexed by $(u_{12}, x_2)$, $-\theta_{21}$ is indexed by $(u_{12}, s)$, $\delta_{21}$ is indexed
by $(u_{12}, w_{12})$, and another is indexed by $(\sigma_2, w_{12})$.

Non-negativity: $x_2 \ge 0$, and $w_{12}$ is free.


Constraint Types: $u_{12}$ is $\ge$ and $\pi_2$ is $\ge$.
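Formulation ($2 \in \mathcal{N}$):

\[
\begin{array}{c|cc|cl}
 & x_2 & w_{12} & & s_2 \\ \hline
u_{12} & \psi_{21} & \delta_{21} & \ge & -\theta_{21} \\
\pi_2 & A_2 & & \ge & b_2 \\ \hline
 & \ge 0 & \text{free} & & \\ \hline
\sigma_2 & 0 & \delta_{21} & &
\end{array}
\]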

The node 2 subproblem will be solved only once based on the constant information
passed to it from node 1. The returned primal solution, when incorporated into $X_{12}$
and $g_{12}$ in the node-1 subproblem, will allow it to be solved in only one iteration. The
overall  optimum  is  then  achieved.  The  overall  solution  is  »*i)* 


A.2  Staircase  Example 

This  example  differs  from  the  previous  in  that  it  uses  Benders  Decomposition  and 
there are now coupling constraints between the partitioned columns. As a result, the
information  passed  over  the  communication  network  will  not  be  so  trivial. 

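\[
\begin{array}{lll}
\min\limits_{x^i \ge 0} & c^1 x^1 + c^2 x^2 = z & \\
\text{s.t.} & \pi^1:\; A^{11} x^1 \ge b^1 & \qquad \text{(A.2)} \\
 & \pi^2:\; A^{21} x^1 + A^{22} x^2 \ge b^2. &
\end{array}
\]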

As  in  the  previous  example,  the  names  of  the  primal  and  dual  variables  of  the  Staircase 
Problem are used as indices for the rows and columns of the matrix A.

Staircase  Problem  Description: 

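\[
\mathcal{R} = \pi^1 \cup \pi^2, \qquad \mathcal{C} = x^1 \cup x^2,
\]
\[
A = \begin{pmatrix} A^{11} & 0 \\ A^{21} & A^{22} \end{pmatrix} \in \Re^{\mathcal{R} \times \mathcal{C}}, \qquad
b = \begin{pmatrix} b^1 \\ b^2 \end{pmatrix} \in \Re^{\mathcal{R}}, \qquad
c^T = \begin{pmatrix} c^1 & c^2 \end{pmatrix} \in \Re^{1 \times \mathcal{C}}.
\]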
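Staircase Communication Network Description:

\[
(\mathcal{N}, \mathcal{A}) = (\{1, 2\}, \{(12), (21)\}),
\]
\[
\mathcal{R}_1 = \pi^1 \cup \pi^2, \qquad \mathcal{R}_2 = \pi^1 \cup \pi^2,
\]
\[
\mathcal{C}_1 = x^1, \qquad \mathcal{C}_2 = x^2,
\]
\[
T_{12} = \text{right}, \qquad T_{21} = \text{left}.
\]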


Staircase  Incidence  Graph  Description: 


\[
h = (\{\sigma, \pi^1, \pi^2, s, x^1, x^2\},\; \{(\sigma x^1), (\sigma x^2), (\pi^1 s), (\pi^1 x^1), (\pi^2 s), (\pi^2 x^1), (\pi^2 x^2)\}).
\]


Staircase Arc Index Sets: There are no column coupling sets since there are no
vertical arcs, $\mathcal{R}_{12} = \mathcal{R}_{21} = \pi^2$.


Staircase Partition Graphs: There is only one partition graph, so all added
variables will be indexed by their associated arcs: $\mathcal{R}_p = \mathcal{R}$ and $\mathcal{C}_p = \mathcal{C}$.

Staircase Subproblem ($1 \in \mathcal{N}$): Node one is topmost and leftmost.

Original  Variables:  xj  G  &**,  and  xj  in$fl . 

Original  Data:  A\  =  ^a|  j ^  ^ ,  and  cj=  ( c1 ) . 

Incoming  Arcs:  There  is  one  incoming  arc  to  node  one  and  its  type  is  7ji  =  left. 

Added  Variables:  A2J  G  %lK,t  t2i  G  3?,  and  y2i  G  SR71’1. 

Added  Data:  1  is  indexed  by  (<ri,  t2i),  7i2  is  indexed  by  (A2i,  f2i),  fli2  is  indexed 
by  (A21,j/2i),  and  -ln  is  indexed  by  (x,,^). 

Non-negativity:  Xi  >  0,  and  both  y2i  and  t2j  are  free. 


Constraint  Types:  A2J  is  >  and  xi  is  =. 
Formulation ($1 \in \mathcal{N}$):


X\ 

Vix 

*21 

Sr 

Xu 

n,2 

712 

> 

0 

*1 

Ax 

—hi 

= 

>0 

free 

frtt 

<t\ 

CT  1 
ci  1 

0 

1  1 

Staircase Subproblem ($2 \in \mathcal{N}$): Node two is topmost and not leftmost.

Original  Variables:  *2  €  and  xj  6  5^*. 

Original  Data:  A 2  = 

Incoming  Arcs:  There  »  one  arc  (12)  incident  to  node  one  and  its  type  is 
Tyi  -  right. 

Added  Variables:  un  €  SR,  and  €  $. 

« 

Added  Data:  y2\  is  indexed  by  (x2,uu),  — F21  is  indexed  by  (o^uu),  d2y  is 
indexed  by  (wi2,Ui2),  and  another  d2i  is  indexed  by  (wi2,S2)- 

Non-negativity:  >  0  and  X2  >  0. 

Constraint  Types:  r2  is  >,  un  is  =  1. 


U12 

X2 

S2 

*2 

j/21 

A2 

> 

0 

a>i2 

d2\ 

= 

d2\ 

Formulation  (2  6  Af): 
A.3  Two-Stage  Stochastic  Example

Consider $b^2$ in (A.3) to be a discrete random variable having realizations $b_s^2$ and
probabilities $P_s$ for all $s \in \{1, \ldots, S\}$. We will derive the subproblems for this
multi-point distribution.

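\[
\begin{array}{lll}
\min\limits_{x^i \ge 0} & c^1 x^1 + c^2 x^2 = z & \\
\text{s.t.} & \pi^1:\; A^{12} x^2 \ge b^1 & \qquad \text{(A.3)} \\
 & \pi^2:\; A^{21} x^1 + A^{22} x^2 \ge b^2. &
\end{array}
\]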

Stochastic Problem Description: For a multi-point distribution, the indices $\pi_s^2$
and $x_s^2$ will be repeated for each instance of $s$. However, it is a modeling issue as to
whether $\pi^1$ is repeated. The meaning of these constraints becomes ambiguous when
random data are introduced. If we limit ourselves to linear formulations, we still have
the choice to model them either as a single expected-value constraint, $E_s\{A^{12} x_s^2\} \ge b^1$,
or as multiple absolute constraints $A^{12} x_s^2 \ge b^1$, for all $s$.

Given the a priori assumption to keep the objective linear by using expected
values, the second case corresponds to Stochastic Linear Recourse. To keep things
simple we choose the expected-value constraint.

7£  =  x1Ux2,  C  =  x1  UxJ, 

/  0  A\7  •••  Als7\ 

y<21  A22  /  l\  \ 

/t=  .  ..  s»*xC,  6=L]jeSR,  cT=(c*  cj)6*lxc, 

v/l21  AnJ 

A12  =  P,A12,  c2  =  PjC2. 

Stochastic  Communication  Network  Description:  This  example  crosses  D-W 
and  Benders  Decomposition.  First  D-W  is  applied,  then  Benders  is  applied  to  the 
bottom  problem.  This  obviates  the  need  for  the  special  Cross-Splitting  described 
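previously. The communication network is defined for all $s \in \{1, \ldots, S\}$:

\[
(\mathcal{N}, \mathcal{A}) = (\{1, 2, s\}, \{(12), (21), (1s), (s1), (2s), (s2)\}),
\]
\[
\mathcal{R}_1 = \pi^1, \qquad \mathcal{R}_2 = \pi_s^2 \;\; \forall s \in \{1, \ldots, S\}, \qquad \mathcal{R}_s = \pi_s^2,
\]
\[
\mathcal{C}_1 = x^1 \cup x_s^2 \;\; \forall s \in \{1, \ldots, S\}, \qquad \mathcal{C}_2 = x^1, \qquad \mathcal{C}_s = x_s^2,
\]
\[
T_{12} = \text{down}, \quad T_{21} = \text{up}, \quad T_{1s} = \text{down}, \quad T_{s1} = \text{up}, \quad T_{2s} = \text{right}, \quad T_{s2} = \text{left}.
\]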

In  addition  to  nodes  one  and  two,  this  network  has  one  node  for  each  distribution 
point.  Each  communicates  to  node  1  via  up  and  down  arcs,  and  with  node  2  via  left 
and  right  arcs. 


Stochastic Incidence Graph Description: Note that the nodes for $\pi_s^2$ and $x_s^2$
are repeated for each instance of $s$.

\[
h = (\{\sigma, \pi^1, \pi_s^2, s, x^1, x_s^2\},\; \{(\sigma x^1), (\sigma x_s^2), (\pi^1 s), (\pi^1 x_s^2), (\pi_s^2 s), (\pi_s^2 x^1), (\pi_s^2 x_s^2)\}).
\]


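Stochastic Arc Index Sets: Nodes 1 and 2 communicate only objective information,
as we saw previously in the Block Diagonal Example.

\[
\mathcal{R}_{2s} = \mathcal{R}_{s2} = \pi_s^2, \qquad \mathcal{C}_{12} = \mathcal{C}_{21} = x^1, \qquad \mathcal{C}_{1s} = \mathcal{C}_{s1} = x_s^2.
\]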


Stochastic Partition Graphs: There are two partition graphs $p_1$ and $p_2$. They
are ordered so that $p_1$ is before (i.e., the parent of) $p_2$.

\[
p_1 = (\{1, 2, s\}, \{(12), (21), (1s), (s1)\}), \qquad \mathcal{R}_{p_1} = \mathcal{R}, \qquad \mathcal{C}_{p_1} = \mathcal{C},
\]
\[
p_2 = (\{2, s\}, \{(2s), (s2)\}), \qquad \mathcal{R}_{p_2} = \pi_s^2, \qquad \mathcal{C}_{p_2} = \mathcal{C}.
\]

The added variables associated with arcs (21) and (s1), which link the child partition
nodes {2, s} to the parent partition node {1}, will be indexed by the child partition
$p_2$.


Stochastic Subproblem ($1 \in \mathcal{N}$): Node one is topmost and leftmost.
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Original  Variables:  6  and  ij  6  5iCl. 

Original  Data:  /tj  —  ( 0  /i]2 ) ,  &i  =  ( 61 ) ,  and  cf=(cl  c]). 

Incoming  Arcs:  Node  one  has  S  1  incoming  arcs  (21)  and  (si)  and  their 
types  arc  7ji  =up  and  Ttl  =up  for  all  s  €  {1,. . . ,5}.  They  all  connect  the  partition 
graphs  pi  and  pa. 

Added  Variables:  /w  6  &Kl,  0n  €  SR,  fax  6  and  V>*i  €  3^“.  The  sources 
of  the  incoming  arcs  arc  in  pj.  Therefore,  /  and  0  arc  subscripted  by  pj. 

Added  Data:  1  is  indexed  by  (0^,a i),  gn  is  indexed  by  (0W,  /w),  Xu  is  indexed 
by  (faxJn),  A'i,  is  indexed  by  (fax ,/w),  --/,2  is  indexed  by  (V»ai,xi),  and  -Iu  is 
indexed  by  (V>ji,xi)  for  all  s  €  {1,...,$}.  Note  that  g  is  subscripted  by  pj. 

Non-negativity:  /21  >  0,  *',1  >  0  and  xx  is  free. 


Constraint  Types:  r\  is  >,  fax  is  =,  ip,x  is  =,  and  0n  is  =. 


Formulation  (1  €  Af)i  This  is  a  D-W  Master  problem  to  the  implicit  subprob¬ 
lem  defined  on  p2.  It  incorporates  new  columns  in  a  synchronous  manner  based  on  a 
P2-fcasible  point  for  a  given  value  of  (fau^lx)' 


fax 

fax 


v\ 


In 

ii 

Ai 

Xxi 

—Iu 

XXs 

-hs 

-r 

9n 

>0 

free 

0 

M 

S I 


bx 


Vs  6  {1, .  •  •  >5}j 
Stochastic Subproblem ($2 \in \mathcal{N}$): Node 2 is leftmost and not topmost.

Original  Variables:  x2  €  and  x2  € 

Original  Data:  /12  —  ( A17 ) ,  6a  =  ( b2 ) ,  and  c,  =  ( 0 ) . 

Incoming  Arcs:  There  are  5  +  1  arcs  entering  node  two  and  their  types  arc 
T\2  =  down  and  T#  =  left.  They  all  lie  within  pa* 

Added  Variables:  u>ia  6  3?,  via  G  Aj2  €  tt a  €  3fc,  and  yf2  G  &**1. 

Added  Data:  V>2i  is  indexed  by  (via,z2),  —O21  is  indexed  by  (ui2,s),  6n  is 
indexed  by  (via,u>i2),  another  621  is  indexed  by  (^a.u/u),  1  >s  indexed  by  (<r2,  t*a)»  in 
is  indexed  by  fta«  is  indexed  by  (Aj2, y*a)»  and  -J2,  is  indexed  by  (x2ly<2)* 

Non-negativity:  x2  >  0,  and  yf2,  tni2l  and  tl2  arc  free. 

'Constraint  Types:  Aj2  is  >,  Via  is  >,  and  x2  is  =. 

Formulation  (2  €  Af):  This  is  a  Benders  Master  program  to  the  subproblems 
defined  over  s.  Each  subproblem  adds  constraints  independently  of  the  others. 


x2 

V»2 

tuia 

t«2 

S2 

Aj2 

n2. 

in 

> 

0 

uja 

'Ph 

621 

> 

—On 

tt2 

/12 

-h. 

= 

62 

>0 

free 

free 

free 

i  0 

0 

621 

1 



Stochastic Subproblem ($s \in \mathcal{N}$): Node $s$ is neither topmost nor leftmost.

Original  Variables:  x,  G  3?*'  and  xt  G  3^*. 

Original  Data:  At  =  ( A77 ),  bt  =  ( 0 ),  and  cj  =  ( 0 ). 

Incoming  Arcs:  Node  s  has  two  entering  arcs  (Is)  and  (2s)  with  types  T\t  = 
down,  Tu  —  right. 

Added  Variables:  ti>u  €  3?,  vu  G  3i,  wj,  G  3?,  and  ti3,  G  3?. 

Added  Data:  if>t i  is  indexed  by  (ui,,!,),  -9tl  is  indexed  by  (vj„s4),  S,i  is 
indexed  by  (uit,tuu),  6, j  is  indexed  by  (<r,,tui,),  yt2  is  indexed  by  (x„ti3j),  -tl2 
is  indexed  by  (<rt}u2t)t  dt2  is  indexed  by  (wu,u3,),  and  another  d,2  is  indexed  by 
(wi„s,). 


Non-negativity:  u2,  >  0,  x,  >  0,  and  wlt  is  free. 


Constraint  Types:  ul4  is  >,  xt  is  >,  and  o>2<  is  =. 


Formulation  (s  G  A'): 


u2f 

x. 

wu 

i. 

Uj, 

*7l 

hi 

> 

-o.  X 

y.7 

A, 

> 

0 

U)U 

d,2 

= 

dt  2 

>0 

IV 

o 

free 

V, 

—tt2  ] 

0 

hi 

A.4  Dense  Example


This final example demonstrates cross splitting on a dense matrix. The communication
network has five nodes, one of which has an empty row index set. This is
a trick by which we can apply the cross splitting technique to the extent that each
subproblem is based on a single coefficient of the constraint matrix. Our starting
formulation is

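\[
\begin{array}{lll}
\min\limits_{x^i \ge 0} & c^1 x^1 + c^2 x^2 = z & \\
\text{s.t.} & \pi^1:\; A^{11} x^1 + A^{12} x^2 \ge b^1 & \qquad \text{(A.4)} \\
 & \pi^2:\; A^{21} x^1 + A^{22} x^2 \ge b^2. &
\end{array}
\]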

Dense  Problem  Description: 

\[ \mathcal{R} = \pi^1 \cup \pi^2, \qquad \mathcal{C} = x^1 \cup x^2, \]

£)«*“■  *>€*-. 

Dense  Communication  Network  Description: 

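\[
\mathcal{N} = \{1, 2, 3, 4, 5\},
\]
\[
\mathcal{A} = \{(12), (21), (13), (31), (14), (41), (15), (51), (23), (32), (45), (54)\},
\]
\[
\mathcal{R}_1 = \emptyset, \quad \mathcal{R}_2 = \pi^1, \quad \mathcal{R}_3 = \pi^1, \quad \mathcal{R}_4 = \pi^2, \quad \mathcal{R}_5 = \pi^2,
\]
\[
\mathcal{C}_1 = x^1 \cup x^2, \quad \mathcal{C}_2 = x^1, \quad \mathcal{C}_3 = x^2, \quad \mathcal{C}_4 = x^1, \quad \mathcal{C}_5 = x^2,
\]
\[
T_{12} = T_{13} = T_{14} = T_{15} = \text{down}, \qquad T_{21} = T_{31} = T_{41} = T_{51} = \text{up},
\]
\[
T_{23} = T_{45} = \text{right}, \qquad T_{32} = T_{54} = \text{left}.
\]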


Dense  Incidence  Graph  Description: 


\[
h = (\{\sigma, \pi^1, \pi^2, s, x^1, x^2\},\; \{(\sigma x^1), (\sigma x^2), (\pi^1 s), (\pi^1 x^1), (\pi^1 x^2), (\pi^2 s), (\pi^2 x^1), (\pi^2 x^2)\}).
\]


Dense  Arc  Index  Sets: 

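\[
\mathcal{R}_{23} = \mathcal{R}_{32} = \pi^1, \qquad \mathcal{R}_{45} = \mathcal{R}_{54} = \pi^2,
\]
\[
\mathcal{C}_{12} = \mathcal{C}_{21} = x^1, \quad \mathcal{C}_{13} = \mathcal{C}_{31} = x^2, \quad \mathcal{C}_{14} = \mathcal{C}_{41} = x^1, \quad \mathcal{C}_{15} = \mathcal{C}_{51} = x^2.
\]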
Dense Partition Graphs: There are three partition graphs $p_1$, $p_2$, and $p_3$ in the
communication network of this example. As determined by the ordering of the nodes,
$p_1$ is the parent to both $p_2$ and $p_3$:

$p_1 = (\{1, 2, 3, 4, 5\}, \{(12), (21), (13), (31), (14), (41), (15), (51)\})$,

$\mathcal{R}_{p_1} = \mathcal{R}$, $\mathcal{C}_{p_1} = \mathcal{C}$,

$p_2 = (\{2, 3\}, \{(23), (32)\})$, $\mathcal{R}_{p_2} = \pi^1$, $\mathcal{C}_{p_2} = \mathcal{C}$,

$p_3 = (\{4, 5\}, \{(45), (54)\})$, $\mathcal{R}_{p_3} = \pi^2$, $\mathcal{C}_{p_3} = \mathcal{C}$.

Therefore, the added variables associated with arcs (21), (31), (41), and (51), which
link the child partition nodes {2, 3, 4, 5} to the parent partition node {1}, will be
indexed by child partitions, namely $p_2$ and $p_3$.
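
One way to reproduce the three partition graphs of this example is to group arcs by orientation. The sketch below is a hedged illustration, not the thesis's general definition: it assumes that the up/down arcs form the parent partition graph and that each connected component of the left/right arcs forms a child partition graph. The function name `partition_graphs` and the dictionary `ARC_TYPE` are hypothetical.

```python
from collections import defaultdict

# Arc types of the dense example (same data as in the network description).
ARC_TYPE = {
    (1, 2): "down", (1, 3): "down", (1, 4): "down", (1, 5): "down",
    (2, 1): "up", (3, 1): "up", (4, 1): "up", (5, 1): "up",
    (2, 3): "right", (4, 5): "right",
    (3, 2): "left", (5, 4): "left",
}


def partition_graphs(arc_type):
    """Split arcs into a parent graph (vertical arcs) and child graphs
    (connected components of the horizontal arcs). Assumed rule, see lead-in."""
    vertical = {a for a, t in arc_type.items() if t in ("up", "down")}
    horizontal = {a for a, t in arc_type.items() if t in ("left", "right")}

    # Parent graph: all endpoints of vertical arcs, plus the vertical arcs.
    parent = ({n for arc in vertical for n in arc}, vertical)

    # Undirected adjacency over horizontal arcs.
    adjacency = defaultdict(set)
    for i, j in horizontal:
        adjacency[i].add(j)
        adjacency[j].add(i)

    # Depth-first search to collect each connected component.
    children, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        children.append((component, {a for a in horizontal if a[0] in component}))
    return parent, children
```

Applied to `ARC_TYPE`, the parent graph contains all five nodes and the eight up/down arcs, and the two child graphs are built on the node sets {2, 3} and {4, 5}.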

Dense Subproblem ($1 \in \mathcal{N}$): Node one is topmost and leftmost.

Original Variables: $x_1 \in \Re^{x_1}$, since the row index set for node one is empty.

Original Data: $c_1^T = (c^1 \; c^2)$.

Incoming Arcs: Node one has four entering arcs with sources in two different
child partitions. Arcs (21) and (31) are from $p_2$ and arcs (41) and (51) are from $p_3$.
Their types are $T_{21} = T_{31} = T_{41} = T_{51} = \text{up}$.

Added  Variables:  fa  6  fa  G  fa  G  fa  G  #«,  0n  €  &, 

G  8,  G  »*«,  and  /M  €  £*«. 

Added  Data:  1  is  indexed  by  (flw,Sj),  1  is  indexed  by  (0Pi,$\),  is  indexed 
by  (°n  ,/pj),  Ppj  is  indexed  by  (0W,/M),  Au  is  indexed  by  (fa,^),  Ai3  is  indexed 
by  (tfaiiW)  Ah  >s  indexed  by  (fa,lPi),  Xu  is  indexed  by  (fa, lP3),  -I12  is  indexed 
by  (^21.2=1)1  ~/i3  is  indexed  by  (fa,xi),  -Iu  is  indexed  by  (fa,xi)t  and  -Ils  is 
indexed  by  (fa,xi). 
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Non-negativity:  ln  >  0,  lPi  >  0,  and  z,  is  free. 

Constraint  Types:  02l,  03h  0si,  0W»  and  0W  arc  all  equalities. 
Formulation  (1  GyV): 


hi 

Si 

*1 

021 

Xn 

—hi 

= 

0 

031 

A',3 

“Il3 

= 

0 

041 

Xu 

~/u 

ss 

0 

0S1 

Xu 

“As 

ss 

0 

0n 

-T 

9n 

S5 

7~ 

On 

On 

s 

i 

>0 

>0 

free 

<r\ 

JU 

0 

cT 

ci 

Dense Subproblem ($2 \in \mathcal{N}$): Node two is leftmost and not topmost.

Original Variables: $\pi_2 \in \Re^{\pi_2}$ and $x_2 \in \Re^{x_2}$.

Original Data: $A_2 = (A^{11})$, $b_2 = (b^1)$, and $c_2^T = (0)$.

Incoming Arcs: There are two arcs entering this node, (12) and (32), and their
types are $T_{12} = \text{down}$ and $T_{32} = \text{left}$. Arc (12) spans $p_1$ and $p_2$, but it is a down arc, so no
explicit synchronization is necessary.

Added  Variables:  A32  G  R*>,  u12  G  y32  €  w12  G  3?,  and  <32  <=  31. 

Added  Data:  is  indexed  by  (uJ2,x2),  -02,  is  indexed  by  (u12,s2),  S2l  is 

indexed  by  (u12,tw12),  6n  is  indexed  by  (<r2,u»13),  another  1  is  indexed  by  (o2,*32),  j23 
is  indexed  by  (A32,l32),  ff23  is  indexed  by  (A 32,y32),  and  -/23  is  indexed  by  (tt2,i/32). 




Non-negativity: $x_2$, $y_{32}$, $w_{12}$, and $t_{32}$ are all free.
Constraint  Types:  A32  is  >,  vt2  is  >,  and  x2  is  —. 
Formulation  (2  €  Af): 


*2 

ysa 

IU\2 

^32 

S2 

A32 

fias 

723 

> 

0 

u,2 

> 

—021 

*2 

Ai 

—I?3 

= 

62 

free 

free 

free 

free 

<r2 

0 

1  0 

^21  | 

1 

Dense Subproblem ($3 \in \mathcal{N}$): Node three is neither leftmost nor topmost.

Original Variables: $\pi_3 \in \Re^{\pi_3}$ and $x_3 \in \Re^{x_3}$.

Original Data: $A_3 = (A^{12})$, $b_3 = (0)$, and $c_3^T = (0)$.

Incoming Arcs: There are two incoming arcs to node three, (13) and (23), and
their types are $T_{13} = \text{down}$ and $T_{23} = \text{right}$.

Added  Variables:  Uj3  G  3i,  wm  G  &,  tz23  G  X,  and  u>i3  G  X. 

Added  Data:  $ji  is  indexed  by  (ui3,z3),  -03l  is  indexed  by  (ui3,s3),  £3i  is 
indexed  by  (t>i3,  u>i3),  53 1  is  indexed  by  (<r3,ioi3),  yaa  is  indexed  by  (x3,t:23),  -f32 
is  indexed  by  (<r3,n23),  d32  is  indexed  by  (wai«n),  and  another  d32  is  indexed  by 
(t^a.ss)- 

Non-negativity:  ti23  >  0  z3  is  free,  and  toj3  is  free. 


Constraint  Types:  uj3  is  >,  tt3  is  >,  and  w23  is  =. 
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Formulation  (3  G  Af): 


Dense Subproblem ($4 \in \mathcal{N}$): Similar to node two, this node is leftmost and not
topmost.

Original Variables: $\pi_4 \in \Re^{\pi_4}$ and $x_4 \in \Re^{x_4}$.

Original Data: $A_4 = (A^{21})$, $b_4 = (b^2)$, and $c_4^T = (0)$.

Incoming Arcs: There are two arcs entering node four, and their types are
$T_{14} = \text{down}$ and $T_{54} = \text{left}$.

Added  Variables:  A54  G  SR*1,  Uu  6  Si,  1/54  G  &***,  u>u  G  !R,  and  G  3i. 

Added  Data:  $41  is  indexed  by  (014,14),  -^41  is  indexed  by  (014,54),  641  is 
indexed  by  (014,1014),  is  indexed  by  (<r4,tui4),  another  1  is  indexed  by  (04,  *54),  743 
is  indexed  by  ^54,^4),  II43  is  indexed  by  (A54, s/54),  and  —I^3  is  indexed  by  (*4,1/54). 

Non-negativity: $x_4 \ge 0$, and $y_{54}$, $w_{14}$, and $t_{54}$ are free.

Constraint  Types:  A54  is  >,  014  is  >,  and  *4  is  =. 




Formulation  (4  G  Af): 


*4 

ys4 

Wu 

*54 

34 

A$4 

n45 

745 

> 

0 

Ul4 

til 

hi 

> 

-04! 

*4 

A< 

~/4S 

= 

>0 

free 

free 

free 

<?4 

1  0 

0 

[nr 

1  1 

Dense Subproblem ($5 \in \mathcal{N}$): Similar to node three, node five is neither topmost
nor leftmost.

Original Variables: $\pi_5 \in \Re^{\pi_5}$ and $x_5 \in \Re^{x_5}$.

Original Data: $A_5 = (A^{22})$, $b_5 = (0)$, and $c_5^T = (0)$.

Incoming Arcs: There are two incoming arcs to node five, (15) and (45), and
their types are $T_{15} = \text{down}$ and $T_{45} = \text{right}$.

Added  Variables:  ui5  G  &,  W45  G  &,  G  &,  and  w\$  G  3i. 

Added  Data:  $Sj  is  indexed  by  (visits))  -05i  is  indexed  by  (t>i5,55),  $51  is 
indexed  by  (t>i5,tui5),  £51  is  indexed  by  (<rs, tw15),  &4  is  indexed  by  (xsjt^s),  -hi 
is  indexed  by  (05,^45),  J54  is  indexed  by  (1^45,1445),  and  another  J54  is  indexed  by 

(^45,55). 

Non-negativity:  U45  >  0  x$  >  0,  and  wis  is  free. 


Constraint  Types:  t>i5  is  >,  jt5  is  >,  and  W45  is  =. 
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Formulation  (5  €  Af): 


UAS  Xj  wxi 

ys4  M  _ 

>0  >0  fret 


0‘s  “  *32 


♦ 


Appendix  B 

The  Test  Problems 


For each problem from the test suite, we have produced a bitmap pattern of the
nonzeros in the constraint matrix. The application called SparseDisplay was used
with the consent of its creator Irv Lustig.

The three DIET problems were created by the author from an example in Chvatal
[Chv83]. They are used primarily for test purposes and are quite small and dense.
The next group of problems is from the standard netlib set.

GROW7, GROW15, and GROW22 are of unknown nature and origin.

STAIR is also known as DINAMICO, and is an economic model of Mexico due to
Alan Manne [Man??].

PILOT4 is an early version of a U.S. energy economic model by George Dantzig and
Wesley Winkler.

Finally, the last group of test problems was first documented in [HL81a], and
their descriptions are paraphrased here. Further references are available in the cited
publication.

SC205 is a dynamic multisector development planning model.

SCAGR7 and SCAGR25 are two versions (respectively 7-period and 25-period)
of a large dairy farm expansion planning model.

SCRS8 is a technological assessment model for the transition from fossil to renewable
energy resources in the U.S.

SCORPION is a dynamic energy flow model developed for the oil sector of France.

SCSD1, SCSD6, and SCSD8 are sample problems in the minimal weight design
of multistage trusses under a single loading condition.

SCFXM1, SCFXM2, and SCFXM3 are a production scheduling model (origin
unknown).

SCTAP1, SCTAP2, and SCTAP3 are problems in the optimization of dynamic
traffic flow where congestion is modelled explicitly in the flow equations.


Figure B.1: Bitmap of DIET2.

Figure B.2: Bitmap of DIET3.

Figure B.3: Bitmap of DIET7.

Figure B.4: Bitmap of S

Figure B.5: Bitmap of SCAGR7.

Figure B.6: Bitmap of SCORPION.

Figure B.13: Bitmap of SCRS8.

Figure B.14: Bitmap of PILOT4.

Figure B.16: Bitmap of GROW15.

Figure B.17: Bitmap of SCSD6.


Appendix  C 
Tables 

C.1 Constant Number of Subproblems

The first five tables are the supporting data for Figures 4.3 and 4.6-4.9. The
following are descriptions of the column headings.

NAME The name of a problem from our test suite.

n The number of nodes in the communication network.

p The number of IBM 3090/600E virtual processors.

ITN The total number of simplex method iterations executed on all subproblems.

SLV The total number of solves for all subproblems.

DCPU The cpu time spent for input, solution, and output (micro-seconds).

SCPU The cpu time spent for solution (micro-seconds).

Work The cpu time spent forming and solving subproblems (micro-seconds).

SELP The solution elapsed time (micro-seconds).

OBJTRU The optimal objective value.

Rat The ratio SCPU/SELP.

Eff The ratio Rat/p.

Spd The speedup, measured as the ratio of the smallest serial time, using either MINOS
or DECOMP (p = 1), to the elapsed time.

Spin The percentage of solution time not spent forming and solving subproblems:
(SCPU-Work)/SCPU.
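
The derived columns follow directly from the measured ones. The short Python sketch below restates the definitions of Rat, Eff, and Spin given above; the function names and the sample inputs are illustrative assumptions, and Spd is read here as the smallest serial time divided by the elapsed time.

```python
def performance_ratios(scpu, selp, work, p):
    """Compute (Rat, Eff, Spin) from the column definitions.

    scpu: cpu time spent for solution
    selp: solution elapsed time
    work: cpu time spent forming and solving subproblems
    p:    number of processors
    """
    rat = scpu / selp              # Rat = SCPU / SELP
    eff = rat / p                  # Eff = Rat / p
    spin = (scpu - work) / scpu    # Spin = (SCPU - Work) / SCPU
    return rat, eff, spin


def speedup(best_serial_time, selp):
    """Spd, assumed to be the smallest serial time divided by the elapsed time."""
    return best_serial_time / selp
```

As a usage example with illustrative inputs, `performance_ratios(1800, 1170, 1410, 2)` gives a ratio of about 1.54, an efficiency of about 77%, and a Spin of about 22%.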
Name       n  p   Itn  DPiv   Slv   Tcpu    Wrk   Cmch   Time             Objective   Pwr  Eff  Spd Spin
SCAGR7     0  1    93    10     1    402    228    228    230  -0.2331389824331D+07  0.99  99%  1.0   0%
SCAGR7     0  1    93    10     1    407    232    232    235  -0.2331389824331D+07  0.99  99%  1.0   0%
SCAGR7     7  1   452    43  1388   8061   7869   6940   8125  -0.2331085468991D+07  0.97  97%  0.0  12%
SCAGR7     7  1   452    43  1888   8127   7937   6996   8201  -0.2331085468991D+07  0.97  97%  0.0  12%
SCAGR7     7  2   482    47   233   2058   1800   1410   1170  -0.2331341141112D+07  1.54  77%  0.2  22%
SCAGR7     7  2   625    50   341   2646   2386   1956   1484  -0.2331306455361D+07  1.61  80%  0.2  18%
SCAGR7     7  2   442    39  3101  13103  12863  10868   7996  -0.2331006478981D+07  1.61  80%  0.0  16%
SCAGR7     7  3   562    41   306   2620   2316   1772   1278  -0.2330175061066D+07  1.81  60%  0.2  23%
SCAGR7     7  3   467    45   262   2395   2092   1536   1097  -0.2331057799207D+07  1.91  64%  0.2  27%
SCAGR7     7  3   414    41  1750   9703   9403   6992   5791  -0.2331084668216D+07  1.62  54%  0.0  26%
SCAGR7     7  4   331    47   211   2241   1887   1222   1067  -0.2331282252049D+07  1.77  44%  0.2  35%
SCAGR7     7  4   413    42  1717  12782  12437   6840   8930  -0.2322334978334D+07  1.39  35%  0.0  45%
SCAGR7     7  4   394    48   266   2723   2375   1492   1367  -0.2326167666179D+07  1.74  43%  0.2  37%
SCORPION   0  1   139    61     1   1150    784    784    797   0.1878124822738D+04  0.98  98%  1.0   0%
SCORPION   0  1   139    61     1   1150    785    785    793   0.1878124822738D+04  0.99  99%  1.0   0%
SCORPION   6  1   280   126   114   2438   2027   1975   2072   0.1878124822738D+04  0.98  98%  0.4   3%
SCORPION   6  1   280   126   114   2437   2026   1974   2060   0.1878124822738D+04  0.98  98%  0.4   3%
SCORPION   6  2   279   125   130   4233   3774   2116   2337   0.1878124822738D+04  1.61  81%  0.3  44%
SCORPION   6  2   280   126   114   4141   3680   2034   2269   0.1878124822738D+04  1.62  81%  0.3  45%
SCORPION   6  2   280   126   114   4179   3718   2058   2139   0.1878124822738D+04  1.74  87%  0.4  45%
SCORPION   6  3   278   124   132   4467   3959   2149   2334   0.1878124822738D+04  1.70  57%  0.3  46%
SCORPION   6  3   280   126   130   4496   3990   2143   2391   0.1878124822738D+04  1.67  56%  0.3  46%
SCORPION   6  3   280   126   116   4331   3824   2061   2216   0.1878124822738D+04  1.73  58%  0.4  46%
SCORPION   6  4   280   126   148   5297   4751   2221   3113   0.1878124822738D+04  1.53  38%  0.3  53%
SCORPION   6  4   283   129   148   5019   4469   2234   2652   0.1878124822738D+04  1.69  42%  0.3  50%
SCORPION   6  4   276   125   192   5391   4839   2464   3015   0.1878124822738D+04  1.60  40%  0.3  49%
SCAGR25    0  1   475   116     1   4783   4343   4343   4382  -0.1475343306077D+08  0.99  99%  1.0   0%
SCAGR25    0  1   475   116     1   4768   4332   4332   4378  -0.1475343306077D+08  0.99  99%  1.0   0%
SCAGR25    3  1  1317    71  1002  10190   9733   9343   9888  -0.7034688222719D+07  0.98  98%  0.4   4%
SCAGR25    3  1  1317    71  1002  10247   9791   9396   9947  -0.7034688222719D+07  0.98  98%  0.4   4%
SCAGR25    3  2  1113    72  1002  17059  16554   9711   9909  -0.7034689473487D+07  1.67  84%  0.4  41%
SCAGR25    3  2  1120    72  1001  16475  15969   9619   9951  -0.7034689472408D+07  1.60  80%  0.4  40%
SCAGR25    3  2  1128    71  1001  16486  15983   9671  10111  -0.7034651433827D+07  1.58  79%  0.4  39%
SCAGR25    3  3  1009    76  1002  18167  17616   9699  10172  -0.7034651480718D+07  1.73  58%  0.4  45%
SCAGR25    3  3  1023    76  1001  18344  17762   9761  10447  -0.7034688227004D+07  1.70  57%  0.4  45%
SCAGR25    3  3  1012    71  1001  18457  17874   9776  10180  -0.7034651444565D+07  1.76  59%  0.4  45%
SCAGR25    3  4  1024    76  1001  20497  19865   9819  11813  -0.7034651435092D+07  1.68  42%  0.4  51%
SCAGR25    3  4  1058    71  1003  21407  20770   9979  12934  -0.7034651432544D+07  1.61  40%  0.3  52%
SCAGR25    3  4  1034    69  1001  21037  20410   9836  12466  -0.7034689474197D+07  1.64  41%  0.4  52%
SCTAP1     0  1   354   138     1   2292   1888   1888   1926   0.1412250000000D+04  0.98  98%  0.8   0%
SCTAP1     0  1   354   138     1   2301   1897   1897   1914   0.1412250000000D+04  0.99  99%  0.8   0%
SCTAP1    10  1   513    30   163   1987   1531   1448   1576   0.1412250000000D+04  0.97  97%  1.0   5%
SCTAP1    10  1   513    30   163   1995   1539   1454   1577   0.1412250000000D+04  0.98  98%  1.0   6%
SCTAP1    10  2   749    58   255   3009   2504   2197   1396   0.1412250000000D+04  1.79  90%  1.1  12%
SCTAP1    10  2   627    29   204   2640   2130   1830   1228   0.1412250000000D+04  1.73  87%  1.3  14%
SCTAP1    10  2   589    31   232   2690   2184   1892   1219   0.1412250000000D+04  1.79  90%  1.3  13%
SCTAP1    10  3   640    38   212   2839   2291   1887   1086   0.1412250000000D+04  2.11  70%  1.5  18%
SCTAP1    10  3   645    33   231   2969   2417   2003   1078   0.1412250000000D+04  2.24  75%  1.5  17%
SCTAP1    10  3   751    43   378   3824   3274   2823   1490   0.1412250000000D+04  2.20  73%  1.1  14%
SCTAP1    10  4   558    30   237   2965   2362   1915   1029   0.1412250000000D+04  2.30  57%  1.5  19%
SCTAP1    10  4   719    34   331   3806   3207   2615   143?   0.1412250000000D+04  2.24  56%  1.1  18%
SCTAP1    10  4   665    28   253   3653   3057   2136   1506   0.1412250000000D+04  1.90  48%  1.0  30%

Table C.2: Constant number of subproblems.


Name       n  p   Itn  DPiv   Slv   Tcpu    Wrk   Cmch   Time             Objective   Pwr  Eff  Spd Spin
SCFXM1     0  1   416   154     1   2901   2468   2468   2482   0.1841675902835D+05  0.99  99%  1.0   0%
SCFXM1     0  1   416   154     1   2896   2466   2466   2483   0.1841675902835D+05  0.99  99%  1.0   0%
SCFXM1     4  1  2004   518   313   6587   6107   5981   6254   0.1841675902835D+05  0.98  98%  0.4   2%
SCFXM1     4  1  2004   518   313   6575   6095   5967   6175   0.1841675902835D+05  0.99  99%  0.4   2%
SCFXM1     4  2  2023   505   332  11005  10475   6246   6191   0.1841675902835D+05  1.69  85%  0.4  40%
SCFXM1     4  2  2016   510   324  10947  10415   6189   6193   0.1841675902835D+05  1.68  84%  0.4  41%
SCFXM1     4  2  2031   513   356  11299  10765   6410   6223   0.1841675902835D+05  1.73  86%  0.4  40%
SCFXM1     4  3  2374   571   422  13956  13379   7585   7496   0.1841675902835D+05  1.78  59%  0.3  43%
SCFXM1     4  3  2011   508   362  11872  11299   6409   6379   0.1841675902835D+05  1.77  59%  0.4  43%
SCFXM1     4  3  1972   500   340  11636  11062   6163   6339   0.1841675902835D+05  1.75  58%  0.4  44%
SCFXM1     4  4  2265   527   413  14509  13890   7254   8093   0.1841675902835D+05  1.72  43%  0.3  48%
SCFXM1     4  4  1975   491   354  12547  11922   6308   7089   0.1841675902835D+05  1.68  42%  0.4  47%
SCFXM1     4  4  1486   403   282   9702   9079   4774   5180   0.1845)02437697D+05  1.75  44%  0.5  47%
GROW7      0  1   190    18     1   1477   1099   1099   1115  -0.4778781181471D+08  0.99  99%  0.8   0%
GROW7      0  1   190    18     1   1488   1101   1101   1111  -0.4778781181471D+08  0.99  99%  0.8   0%
GROW7      7  1   185     2    97   1289    867    814    893  -0.4778781181471D+08  0.97  97%  1.0   6%
GROW7      7  1   185     2    97   1300    877    821    901  -0.4778781181471D+08  0.97  97%  1.0   6%
GROW7      7  2   208     2   144   1804   1331   1110    766  -0.4778781181471D+08  1.74  87%  1.2  17%
GROW7      7  2   218     1   115   1688   1212    987    710  -0.4778781181471D+08  1.71  85%  1.3  19%
GROW7      7  2   119     2    74   1222    748    583    458  -0.4778781181471D+08  1.63  82%  1.9  22%
GROW7      7  3   208     2   163   2074   1557   1217    800  -0.4778781181471D+08  1.95  65%  1.1  22%
GROW7      7  3   210     2   167   2170   1652   1264    861  -0.4778781181471D+08  1.92  64%  1.0  23%
GROW7      7  3   214     3   183   2228   1708   1352    890  -0.4778780861421D+08  1.92  64%  1.0  21%
GROW7      7  4   200     2   170   2276   1717   1270    896  -0.4778781181471D+08  1.92  48%  1.0  26%
GROW7      7  4   215     2   170   2203   1640   1293    796  -0.4778781181471D+08  2.06  52%  1.1  21%
GROW7      7  4   195     2   166   2417   1854   1228   1091  -0.4778781181471D+08  1.70  42%  0.8  34%
SCSD1      0  1   206   182     1   1383    903    903    909   0.8666666674333D+01  0.99  99%  1.0   0%
SCSD1      0  1   206   182     1   1381    899    899    908   0.8666666674333D+01  0.99  99%  1.0   0%
SCSD1      3  1   857   653    31   2430   1901   1882   1922   0.8666666674650D+01  0.99  99%  0.5   1%
SCSD1      3  1   857   653    31   2430   1897   1878   1920   0.8666666674650D+01  0.99  99%  0.5   1%
SCSD1      3  2   935   657    49   3272   2689   2101   1438   0.8666666674333D+01  1.87  93%  0.6  22%
SCSD1      3  2   961   679    51   3329   2743   2164   1458   0.8666666674333D+01  1.88  94%  0.6  21%
SCSD1      3  2   948   671    53   3330   2747   2137   1462   0.8666666674334D+01  1.88  94%  0.6  22%
SCSD1      3  3   677   517    25   3001   2374   1476   1127   0.8666666674333D+01  2.11  70%  0.8  38%
SCSD1      3  3   819   628    60   3803   3178   1973   1496   0.8666666674333D+01  2.12  71%  0.6  38%
SCSD1      3  3  1060   771    62   4383   3757   2442   1723   0.8666666674334D+01  2.18  73%  0.5  35%
SCSD1      3  4  1032   744    73   4382   3715   2419   1561   0.8666666674333D+01  2.38  59%  0.6  35%
SCSD1      3  4   963   713    56   4352   3682   2230   1741   0.8666666674333D+01  2.11  53%  0.5  39%
SCSD1      3  4   589   438    42   2914   2241   1390   1045   0.8666666674333D+01  2.14  54%  0.9  38%
STAIR      0  1   473    36     1   6728   6196   6196   6271  -0.2512669511930D+03  0.99  99%  0.5   0%
STAIR      0  1   473    36     1   6693   6166   6166   6224  -0.2512669511930D+03  0.99  99%  0.5   0%
STAIR      6  1   240    12   286   3870   3292   3175   3360  -0.2087999900000D+03  0.98  98%  1.0   4%
STAIR      6  1   240    12   286   3914   3333   3215   3403  -0.2087999900000D+03  0.98  98%  1.0   4%
STAIR      6  2   228    12   301   6542   5909   3424   3453  -0.2087999900000D+03  1.71  86%  1.0  42%
STAIR      6  2   229    12   292   6525   5896   3370   3508  -0.2087999900000D+03  1.68  84%  1.0  43%
STAIR      6  2   232    12   291   6484   5851   3364   3511  -0.2087999900000D+03  1.67  83%  1.0  43%
STAIR      6  3   216    12   294   6983   6310   3386   3710  -0.2087999900000D+03  1.70  57%  0.9  46%
STAIR      6  3   228    12   288   7066   6393   3310   3723  -0.2087999900000D+03  1.72  57%  0.9  48%
STAIR      6  3   214    12   294   7204   6531   3385   3730  -0.2087999900000D+03  1.75  58%  0.9  48%
STAIR      6  4   214    12   290   7752   7023   3376   4035  -0.2087999900000D+03  1.74  44%  0.8  52%
STAIR      6  4   214    12   289   7484   6761   3362   3808  -0.2087999900000D+03  1.78  44%  0.9  50%
STAIR      6  4   214    12   287   7701   6977   3329   4083  -0.2087999900000D+03  1.71  43%  0.8  52%

Table C.3: Constant number of subproblems.


Name       n  p   Itn  DPiv   Slv   Tcpu    Wrk   Cmch   Time             Objective   Pwr  Eff  Spd Spin
SCRS8      0  1   861   319     1   9207   8507   8507   8656   0.9042969538008D+03  0.98  98%  0.3   0%
SCRS8      0  1   861   319     1   9211   8511   8511   8585   0.9042969538008D+03  0.99  99%  0.3   0%
SCRS8      7  1   823   235    99   3330   2555   2499   2595   0.9042969538008D+03  0.98  98%  1.0   2%
SCRS8      7  1   823   235    99   3316   2544   2490   2583   0.9042969538008D+03  0.98  98%  1.0   2%
SCRS8      7  2   788   232   131   4766   3934   2644   2182   0.9042969538008D+03  1.80  90%  1.2  33%
SCRS8      7  2   787   231   130   4720   3893   2615   2116   0.9042969538008D+03  1.84  92%  1.2  33%
SCRS8      7  2   801   233   118   4680   3857   2568   2122   0.9042969538008D+03  1.82  91%  1.2  33%
SCRS8      7  3   805   234   145   5047   4180   2759   2123   0.9042969538008D+03  1.97  66%  1.2  34%
SCRS8      7  3   790   233   144   5103   4225   2751   2103   0.9042969538008D+03  2.01  67%  1.2  35%
SCRS8      7  3   799   234   126   4911   4042   2615   2078   0.9042969538008D+03  1.95  65%  1.2  35%
SCRS8      7  4   806   235   155   5512   4590   2889   2156   0.9042969538008D+03  2.13  53%  1.2  37%
SCRS8      7  4   806   234   157   5379   4465   2856   2138   0.9042969538008D+03  2.09  52%  1.2  36%
SCRS8      7  4   807   235   148   5510   4594   2850   2244   0.9042969538008D+03  2.05  51%  1.2  38%
PILOT4     0  1  3730  1111     1  51213  50417  50417  51028  -0.2581016628137D+04  0.99  99%  0.3   0%
PILOT4     0  1  3730  1111     1  51321  50535  50535  51127  -0.2581016628137D+04  0.99  99%  0.3   0%
PILOT4     4  1  2542   178   420  16777  15951  15636  16277  -0.7319905016480D+12  0.98  98%  1.0   2%
PILOT4     4  1  2542   178   420  1680?  15980  15663  16596  -0.7319905016480D+12  0.96  96%  1.0   2%
PILOT4     4  2  3084  1168   407  *****  *****  88612  90094  -0.4643218961124D+15  1.86  93%  0.2  47%
PILOT4     4  2  2345   179   423  29526  28616  15825  16272  -0.7319896083688D+12  1.76  88%  1.0  45%
PILOT4     4  2   467   161    58   5701   4788   3268   2659  -0.3550972527810D+12  1.80  90%  6.1  32%
PILOT4     4  3   829   177   154  12601  11676   6378   6009  -0.1349535969926D+15  1.94  65%  2.7  45%
PILOT4     4  3  2334   177   422  32360  31379  16217  17092  -0.7319897177463D+12  1.84  61%  1.0  48%
PILOT4     4  3  2334   177   422  31835  30884  16064  17392  -0.7319897177463D+12  1.78  59%  0.9  48%
PILOT4     4  4   455   161    61   6599   5583   3321   2564  -0.1704946512274D+13  2.18  54%  6.3  41%
PILOT4     4  4  2329   175   422  33517  32501  16089  18072  -0.7168499166778D+12  1.80  45%  0.9  50%
PILOT4     4  4   455   161    58   6778   5801   3323   2979  -0.3026879763031D+13  1.95  49%  5.5  43%
SCFXM2     0  1   833   292     1  10257   9439   9439  11558   0.3666026156500D+05  0.82  82%  0.8   0%
SCFXM2     0  1   833   292     1  10214   9407   9407   9511   0.3666026156500D+05  0.99  99%  1.0   0%
SCFXM2     8  1  5789   991   678  16627  15752  15423  16273   0.3666029249815D+05  0.97  97%  0.6   2%
SCFXM2     8  1  5789   991   678  16574  15705  15376  15951   0.3666029249815D+05  0.98  98%  0.6   2%
SCFXM2     8  2  6220  1062   856  19118  18191  17652   9802   0.3666026466566D+05  1.86  93%  1.0   3%
SCFXM2     8  2  6054  1071   814  18534  17603  17083   9117   0.3666026790764D+05  1.93  97%  1.0   3%
SCFXM2     8  2  6460  1157   930  20319  19397  18459  10017   0.3666026156500D+05  1.94  97%  0.9   5%
SCFXM2     8  3  5976  1043   927  20953  19983  17757   7807   0.3666026378162D+05  2.56  85%  1.2  11%
SCFXM2     8  3  6024  1015   871  20797  19819  17550   7396   0.3666028795628D+05  2.68  89%  1.3

Table C.4: Constant number of subproblems.

Table C.5: Constant number of subproblems.

Table C.6: Constant number of subproblems.
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C.2  Varying  the  Number  of  Subproblems 


The next three tables support Figures 4.10-4.12. The following are descriptions of
their column headings.

NAME The name of a problem from our test suite.

N The number of subproblems.

DECOMP/1 The amount of work, time, or speedup required to solve SCSD8/N
with one processor.

DECOMP/2 The amount of work, time, or speedup required to solve SCSD8/N
with two processors.

DECOMP/3 The amount of work, time, or speedup required to solve SCSD8/N
with three processors.

DECOMP/4 The amount of work, time, or speedup required to solve SCSD8/N
with four processors.

Work is in CPU seconds, Time is in seconds, and Speedup is dimensionless.
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Table  C.7:  Work  for  a  varying  number  of  subproblems. 
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C.3  Varying  the  Number  of  Processors 


The last table supports Figures 4.13 and 4.14. The following are descriptions of its
column headings.

Problem Name The format is Name/N/p, where Name is the name of the test
problem, N is the number of subproblems, and p is the number of processors.

Total Iterations The simplex method iterations done on all subproblems.

Degen Iterations The number of degenerate simplex iterations done on all
subproblems.

Total Solves The number of subproblems solved.

Total Work The number of CPU seconds used for the entire run.

Power The effective number of CPUs applied to the problem: Work/Time.

Work The number of CPU seconds used to form and solve the subproblems.

Time The elapsed seconds needed to form and solve the subproblems.

Objective Value The objective value obtained for the overall LP problem.
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Table  C.10:  Power,  Work  and  Time  for  a  varying  number  of  processors. 


